Wednesday, 25 May 2011

Informatica Experienced Interview Questions- Part4

76.While importing a relational source definition from the database, what metadata of the source do you import?

77.Difference between Power mart & Power Center?

78.What kinds of sources and of targets can be used in Informatica?

79.If a sequence generator (with increment of 1) is connected to (say) 3 targets and each target uses the NEXTVAL port, what value will each target get?

80.What do you mean by SQL override?

81.What is a shortcut in Informatica?

82.How does Informatica do variable initialization? Number/String/Date

83.How many different locks are available for repository objects?

84.What are the transformations that use cache for performance?

85.What is the use of Forward/Reject rows in Mapping?

86.How many ways you can filter the records?

87.How do you delete duplicate records from a source database/flat file? Can we use post-SQL to delete these records? In the case of a flat file, how can you delete duplicates before it starts loading?

88.You are required to perform “bulk loading” using Informatica on Oracle. What actions would you perform at the Informatica and Oracle levels for a successful load?

89.What precautions do you need to take when you use a reusable Sequence Generator transformation for concurrent sessions?

90.Is a negative increment possible in a Sequence Generator? If yes, how would you accomplish it?

91.In which directory does Informatica look for the parameter file, and what happens if it is missing when you start the session? Does the session stop after it starts?

92.Informatica is complaining that the server could not be reached. What steps would you take?

93.You have more than five mappings that use the same lookup. How can you manage the lookup?

94.What will happen if you copy the mapping from one repository to another repository and if there is no identical source?

95.How can you limit number of running sessions in a workflow?

96.An Aggregator transformation has 4 ports (sum(col1), group by col2, col3); which port should be the output?

97.What is a dynamic lookup and what is the significance of NewLookupRow? How will you use them for rejecting duplicate records?

98.If you have more than one pipeline in your mapping, how will you change the order of load?

99.When you export a workflow from Repository Manager, what does this xml contain? Workflow only?

100. Your session failed, and when you try to open the log file it complains that the session details are not available. How would you trace the error? What log file would you look for?

101.You want to attach a file from a particular directory as an email attachment using the ‘Email task’ in Informatica. How will you do it?

102. You have a requirement to be alerted of any long-running sessions in your workflow. How can you create a workflow that will send you an email for sessions running more than 30 minutes? You can use any method: a shell script, a procedure, or an Informatica mapping or workflow control.

Tuesday, 24 May 2011

Informatica Real time Scenarios.

Scenario 1: How can we load first and last record from a flat file source to target?

Solution:

Create two pipelines in the mapping.

The 1st pipeline captures the first record, and the 2nd one the last record.

1st pipeline:

src -> sq -> exp (take a variable port with a numeric datatype that increments per row and pass it through an output port 'O_Test') -> filter (pass only rows where O_Test = 1) -> tgt

2nd pipeline:

src -> sq -> agg (no group-by ports, so it passes through only the last row) -> tgt

In the session, enable the 'Append if Exists' option for the 2nd target instance.
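The two-pipeline behaviour can be sketched in plain Python (a hypothetical helper, not Informatica itself; the function name and row format are illustrative assumptions):

```python
def first_and_last(rows):
    """Return the rows the two pipelines load: the first and the last record."""
    out = []
    counter = 0            # expression variable port that increments per row
    last = None
    for row in rows:
        counter += 1
        if counter == 1:   # filter condition: O_Test = 1
            out.append(row)
        last = row         # aggregator with no group-by keeps only the last row
    if last is not None:
        out.append(last)   # 2nd target instance with 'Append if Exists'
    return out
```

Note that, exactly like the mapping, a single-row source would load that row twice (once from each pipeline).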

Scenario 2: How do we find the nth row in a flat file? When the source is a table we can do top-N analysis using ROWNUM and other functionality using ROWID, but how do we achieve the same when the source is a flat file?

Solution: In the Mapping designer, go to Mappings-> Parameters and Variables.

Here we have two things: parameters (constant values passed to the mapping) and variables, which are dynamic and can be stored as metadata for future runs. For example, suppose you want to do an incremental load into a table B from table A. You can define a variable which holds the seqid from the source. Before you write the data into the target, create an expression that takes the seqid from the source as input and a variable Max_seqid as output, and update this value for each row. When the session finishes, Informatica saves the last read seqid, and you can use it in your source qualifier the next time you run the mapping. Please see the Informatica documentation for SetMaxVariable and SetMinVariable.

In this case, we just have to make use of a parameter to find the nth row.

Create a parameter named Last_row_number and select its datatype as integer or double.

Now you have to create a parameter file on the Unix box before you call the workflow, something like this:

echo '[<FOLDERNAME>.WF:<WorkflowName>.ST:<SessionName>]' > <workflowname>.par

count=$(wc -l < filename)

echo "\$\$Last_row_number=$count" >> <workflowname>.par

Name the parameter file <workflowname>.par, copy the complete path of the file, and update the "Parameter Filename" field under the Properties tab in the workflow's Edit Tasks.

You can then use this parameter wherever you want in your mapping. Just precede it with $$.
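With the script above in place, the generated parameter file would look something like this (the folder, workflow, and session names here are made-up examples):

```
[MyFolder.WF:wf_load_file.ST:s_m_load_file]
$$Last_row_number=42
```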

Scenario 3: How to create flat file dynamically?

SRC FILE          TRGT 1            TRGT 2
---------------   ---------------   ---------------
Eid Name Sal      Eid Name Sal      Eid Name Sal
10  a    100      10  a    100      20  b    100
20  b    100      10  c    200      20  d    300
10  c    200
20  d    300

Solution :

1. Sort the data coming from the source based on EID.

2. Create a variable in an expression transformation that tracks the change in EID. For example, if the data is sorted on EID it would look like:

EID   Name   SAL
10    a      100
10    c      200
20    b      100
20    d      300

Whenever there is a change in EID the variable tracks it (PREV_EID is a variable port holding the previous row's EID):

variable1 = IIF(EID = PREV_EID, 0, 1)

3. Add a transaction control transformation in the map with a similar condition

IIF(variable1 = 1, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)

This creates a new file whenever the EID value changes.

4. Add a "FileName" port to the target and pass a value into it so that the file names are generated dynamically as per your requirement.
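The transaction-control logic above can be sketched in Python (an assumed helper, not Informatica; the `emp_<eid>.dat` file naming is illustrative): rows sorted by EID are split into a new file on every EID change.

```python
def split_by_key(rows):
    """rows: (eid, name, sal) tuples sorted by eid.
    Returns {filename: [rows]}, one file per EID, mimicking TC_COMMIT_BEFORE."""
    files = {}
    prev_eid = None
    current = None
    for eid, name, sal in rows:
        if eid != prev_eid:   # variable1 = 1 -> commit before, start a new file
            current = files.setdefault("emp_%s.dat" % eid, [])
        current.append((eid, name, sal))   # TC_CONTINUE_TRANSACTION
        prev_eid = eid
    return files
```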

Scenario 4: I have a source file containing

1|A,1|B,1|C,1|D,2|A,2|B,3|A,3|B

and in the target I should get

1|A+B+C+D

2|A+B

3|A+B

Solution:

Follow the logic given below in the expression transformation and you will get the required output.

Please ensure that all the ports mentioned below (except o_OUT) are variable ports, and that the incoming data is sorted by key, data.

 

VarPort        Expression assigned                                  Row1   Row2   Row3    Row4
------------   --------------------------------------------------   ----   ----   -----   -----
V_CURNT        KEY                                                  1      1      1       2
V_CURNT_DATA   DATA                                                 a      b      c       d
v_OUT          IIF(ISNULL(v_PREV_DATA) OR                           a      a~b    a~b~c   d
(variable)       LENGTH(v_PREV_DATA)=5, v_CURNT_DATA,
                 IIF(V_CURNT = V_PREV,
                 V_PREV_DATA||'~'||V_CURNT_DATA, NULL))
o_OUT          v_OUT                                                a      a~b    a~b~c   d
V_PREV         V_CURNT                                              null   1      1       1
V_PREV_DATA    v_OUT                                                null   a      a~b     a~b~c

(Note: the LENGTH(v_PREV_DATA)=5 check is tied to this sample, where a complete group concatenates to 5 characters such as 'a~b~c'; comparing V_CURNT with V_PREV is the general way to detect a key change.)

After the expression transformation, add an Aggregator transformation with 'key' as the group-by port; this returns the last record for each key.
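As a simplified Python sketch of the same logic (assumed helper, not Informatica; it detects the key change directly rather than via the LENGTH check, and keeping the last value per key plays the role of the trailing aggregator grouped on 'key'):

```python
def concat_per_key(rows, sep='~'):
    """rows: (key, data) pairs sorted by key.
    Returns one (key, joined_data) pair per key, in first-seen key order."""
    out = {}
    for key, data in rows:
        # Append to the running concatenation for this key, or start it.
        out[key] = out[key] + sep + data if key in out else data
    return list(out.items())
```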

All of the above scenarios have been taken from the Informatica communities. In case anyone needs more information about the scenarios discussed, they may contact me for clarifications.

Monday, 23 May 2011

Data warehousing Concepts Based Interview Questions

1. What is a data-warehouse?

2. What are Data Marts?

3. What is ER Diagram?

4. What is a Star Schema?

5. What is Dimensional Modelling?

6. What is a Snowflake Schema?

7. What are the Different methods of loading Dimension tables?

8. What are Aggregate tables?

9. What is the Difference between OLTP and OLAP?

10. What is ETL?

11. What are the various ETL tools in the Market?

12. What are the various Reporting tools in the Market?

13. What is Fact table?

14. What is a dimension table?

15. What is a lookup table?

16. What is a general purpose scheduling tool? Name some of them?

17. What are modeling tools available in the Market? Name some of them?

18. What is real time data-warehousing?

19. What is data mining?

20. What is Normalization? What are First Normal Form, Second Normal Form, and Third Normal Form?

21. What is ODS?

22. What type of indexing mechanism do we need to use for a typical data warehouse?

23. Which columns go to the fact table and which columns go to the dimension table? (My user needs to see <data element>, <data element> broken by <data element>, <data element>.)

All elements before "broken by" = fact measures

All elements after "broken by" = dimension elements

24. What is the level of granularity of a fact table? What does this signify? (With weekly-level summarization there is no need to have the Invoice Number in the fact table anymore.)

25. How are dimension tables designed? (De-normalized, wide, short, use surrogate keys, contain additional date fields and flags.)

26. What are slowly changing dimensions?

27. What are non-additive facts? (Inventory, account balances in a bank)

28. What are conformed dimensions?

29. What is a VLDB? (If a database is too large to back up in the available time frame, it's a VLDB.)

30. What are SCD1, SCD2 and SCD3?

Informatica Experienced Interview Questions – part 3

Hi readers. These are the questions I would normally expect an interviewee to know when I sit on a panel. So I request my readers to start posting your answers to these questions in the discussion forum under the "Informatica technical interview guidance" tag; I'll review them, and only valid answers will be kept, while the rest will be deleted.

51.Can you copy a session into a different folder or repository?

52.What is tracing level and what are its types?

53.What is the command used to run a batch?

54.What are the unsupported repository objects for a mapplet?

55.If your workflow is running slow, what is your approach towards performance tuning?

56.What are the types of mapping wizards available in Informatica?

57.After dragging the ports of three sources (SQL Server, Oracle, Informix) to a single source qualifier, can we map these three ports directly to the target?

58.Why we use stored procedure transformation?

59.Which object is required by the debugger to create a valid debug session?

60.Can we use an active transformation after update strategy transformation?

61.Explain how we set the update strategy transformation at the mapping level and at the session level?

62.What is the exact use of the 'Online' and 'Offline' server connect options while defining a workflow in the Workflow Monitor? (The system hangs with the 'Online' server connect option; Informatica is installed on a personal laptop.)

63.What is change data capture?

64.Write a session parameter file which will change the sources and targets for every session, i.e. different sources and targets for each session run.

65.What are partition points?

66.What are the different threads in DTM process?

67.Can we do ranking on two ports? If yes, explain how.

68.What is Transformation?

69.What does stored procedure transformation do in special as compared to other transformation?

70.How do you recognize whether the newly added rows got inserted or updated?

71.What is data cleansing?

72.My flat file's size is 400 MB and I want to see the data inside the file without opening it. How do I do that?

73.Difference between Filter and Router?

74.How do you handle the decimal places when you are importing the flat file?

75.What is the difference between $ & $$ in mapping or parameter file? In which case they are generally used?

Thursday, 19 May 2011

Informatica Experienced Interview Questions - Part2

Hi readers. These are the questions I would normally expect an interviewee to know when I sit on a panel. So I request my readers to start posting your answers to these questions in the discussion forum under the "Informatica technical interview guidance" tag; I'll review them, and only valid answers will be kept, while the rest will be deleted.

26.What is Data driven?

27.What is batch? Explain the types of the batches?

28.What are the types of meta data repository stores?

29.Can you use the mapping parameters or variables created in one mapping into another mapping?

30.Why did we use stored procedure in our ETL Application?

31.When we can join tables at the Source qualifier itself, why do we go for joiner transformation?

32.What is the default join operation performed by the look up transformation?

33.What is hash table Informatica?

34.In a joiner transformation, you should specify the table with lesser rows as the master table. Why?

35.Difference between Cached lookup and Un-cached lookup?

36.Explain what DTM does when you start a work flow?

37.Explain what Load Manager does when you start a work flow?

38.In a sequential batch, how do I stop one particular session from running?

39.What are the types of the aggregations available in Informatica?

40.How do I create Indexes after the load process is done?

41.How do we improve the performance of the aggregator transformation?

42.What are the different types of the caches available in Informatica? Explain in detail?

43.What is polling?

44.What are the limitations of the joiner transformation?

45.What is Mapplet?

46.What are active and passive transformations?

47.What are the options in the target session of update strategy transformation?

48.What is a code page? Explain the types of the code pages?

49.What do you mean rank cache?

50.How can you delete duplicate rows without using a Dynamic Lookup? Tell me any other ways of deleting duplicate rows using a lookup.

Informatica Interview based Scenarios.

 

Hi readers. On a daily basis I'll be posting a few scenarios discussed on IT Toolbox here in my blog, as most of my readers wrote to me asking about interview-based scenarios. In case anyone needs any help with these scenarios, please feel free to contact me.

Skipped record count:

Scenario 1: My source file has 500,000 records. While fetching, records are skipped due to data type and other issues, and finally only 100,000 records are fetched. Through session properties we see 100,000 as the source count,
but we are actually losing 400,000 records. How can I find the number of records that were skipped?

Solution: Query the repository table OPB_SESS_TASK_LOG; it holds a SRC_FAILED_ROWS count for each session task.

Scenario 2: Please provide different ways to achieve this.
How do we normalize the following data:

id date

101 2/4/2008

101 4/4/2008

102 6/4/2008

102 4/4/2008

103 4/4/2008

104 8/4/2008

O/P - should have only one row per id, with the min(date)

How to create a mapping for this ?

1 ---> Using Transformations

2 ---> Using SQ Override method..

Solution: You can use the Rank transformation: group by the id port and set the rank port on the date column. In the Properties tab, select 'Bottom' and set the number of ranks to 1.
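What the Rank transformation does here can be sketched in Python (an assumed helper, not Informatica): group by id and keep the single smallest date per group, the equivalent of Bottom with 1 rank.

```python
from datetime import date

def min_date_per_id(rows):
    """rows: (id, date) pairs in any order.
    Returns one (id, min_date) pair per id, sorted by id."""
    best = {}
    for rid, d in rows:
        if rid not in best or d < best[rid]:   # keep the earliest date seen
            best[rid] = d
    return sorted(best.items())
```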

Scenario 3: My scenario is like this: I am loading records on a daily basis and the target is not a truncate load. Suppose from the source I am loading records like

ID|Name

101|Apple

102|Orange

102|Banana

but the target already has records for ID 102 (say 10 records). What I need is to delete only yesterday's records for ID 102 and load today's records (2 records).

How to achieve this in Informatica?

Solution: You can achieve this by taking a lookup on your target table and matching on the ID column. Then add an expression after the lookup with a FLAG column that checks for the NULL value returned from the lookup. After the expression, take two filters: in one filter, pass the records where the lookup returned NULL and insert those records into the target.

If the value is not NULL, take an Update Strategy and update the old row with the new one.
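The lookup-and-flag routing can be sketched in Python (assumed helper, not Informatica): a lookup on the target's ID column decides which branch a source row takes.

```python
def route_rows(target_ids, source_rows):
    """target_ids: set of ids already in the target (the lookup).
    source_rows: (id, name) pairs from today's load.
    Returns (inserts, updates): unmatched ids go to the insert filter,
    matched ids go to the Update Strategy branch."""
    inserts, updates = [], []
    for rid, name in source_rows:
        if rid in target_ids:      # lookup matched -> FLAG is not NULL
            updates.append((rid, name))
        else:                      # lookup returned NULL -> insert
            inserts.append((rid, name))
    return inserts, updates
```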

Scenario 4:

Informatica Sequence Generator Scenarios

1) I have following set of records

Id | Name

101 | ABC

102 | DEF

101 | AMERICA

103 | AFRICA

102 | JAPAN

103 | CHINA

I need to generate sequence and populate in Target in the following manner

SID | ID | NAME

1   |101 | ABC

2   |101 | AMERICA

1   |102 | DEF

2   |102 | JAPAN

1   |103 | AFRICA

2   |103 | CHINA

How to implement the same in Informatica?

Solution:

1. Sort on Id.

2. Use an expression transformation with the ports below, in this order (Informatica evaluates variable ports top to bottom; i_id is the input port):

v_seq_id = IIF(i_id = v_pre_id, v_seq_id + 1, 1)   -- default value 0

v_pre_id = i_id

o_seq_id = v_seq_id
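These variable ports can be mirrored in Python (assumed sketch, not Informatica), evaluating the ports top to bottom on every row just as the expression transformation does:

```python
def group_sequence(ids):
    """ids: list of id values sorted so equal ids are adjacent.
    Returns a sequence number that restarts at 1 for each new id."""
    out = []
    v_pre_id = None
    v_seq_id = 0                                             # default 0
    for i_id in ids:
        v_seq_id = v_seq_id + 1 if i_id == v_pre_id else 1   # v_seq_id port
        v_pre_id = i_id                                      # v_pre_id port
        out.append(v_seq_id)                                 # o_seq_id port
    return out
```

The same logic also answers Scenario 5 below, where a running count restarts on each new value.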

Scenario 5: I have my input as below (using an Expression):

10

10

10

20

20

30

O/P :

1

2

3

1

2

1

How do we obtain this using the Informatica?

Solution: First import the source, then use a Sorter transformation to sort by the column, then use an Expression transformation.

In the expression, create the ports like this:

1. column_num (the input coming from the sorter)

2. current_num = IIF(column_num = previous_num, first_value + 1, 1)

3. first_value = current_num

4. previous_num (variable port) = column_num

Pass current_num to the target.

Wednesday, 18 May 2011

Informatica Experienced Interview Questions – part 1

  1. Difference between Informatica 7x and 8x?
  2. Difference between connected and unconnected lookup transformations in Informatica?
  3. Difference between stop and abort in Informatica?
  4. Difference between static and dynamic caches?
  5. What is a persistent lookup cache? What is its significance?
  6. Difference between a reusable transformation and a mapplet?
  7. How does the Informatica server sort string values in the Rank transformation?
  8. Is the Sorter an active or passive transformation? When do we consider it to be active, and when passive?
  9. Explain the Informatica server architecture.
  10. In an update strategy, which gives us more performance: a relational table or a flat file? Why?
  11. What are the output files that the Informatica server creates while running a session?
  12. Can you explain what error tables in Informatica are and how we do error handling in Informatica?
  13. Difference between constraint-based loading and target load plan?
  14. Difference between the IIF and DECODE functions?
  15. How to import an Oracle sequence into Informatica?
  16. What is a parameter file?
  17. Difference between Normal load and Bulk load?
  18. How will you create a header and footer in the target using Informatica?
  19. What are the session parameters?
  20. Where does Informatica store rejected data? How do we view it?
  21. What is the difference between partitioning of relational targets and file targets?
  22. What are mapping parameters and variables, and in which situations can we use them?
  23. What do you mean by direct loading and indirect loading in session properties?
  24. How do we implement a recovery strategy while running concurrent batches?
  25. Explain the versioning concept in Informatica.

Tuesday, 17 May 2011

Informatica Real time Interview Questions

 

 

Hi readers. These are the questions I would normally expect an interviewee to know when I sit on a panel. So I request my readers to start posting your answers to these questions in the discussion forum under the "Informatica technical interview guidance" tag; I'll review them, and only valid answers will be kept, while the rest will be deleted.

  1. Explain your Project?
  2. What are your Daily routines?
  3. How many mapping have you created all together in your project?
  4. In which account does your Project Fall?
  5. What is your Reporting Hierarchy?
  6. How many complex mappings have you created? Could you please tell me the situation for which you developed that complex mapping?
  7. What is your Involvement in Performance tuning of your Project?
  8. What is the Schema of your Project? And why did you opt for that particular schema?
  9. What are your Roles in this project?
  10. Can I have one situation which you have adopted by which performance has improved dramatically?
  11. Were you involved in more than two projects simultaneously?
  12. Do you have any experience in the Production support?
  13. What kinds of testing have you done on your project (Unit, Integration, System, or UAT)? And what enhancements were done after testing?
  14. How many Dimension Table are there in your Project and how are they linked to the fact table?
  15. How do we do the Fact Load?
  16. How did you implement CDC in your project?
  17. How does your Mapping in File to Load look like?
  18. How does your Mapping in Load to Stage look like?
  19. How does your Mapping in Stage to ODS look like?
  20. What is the size of your Data warehouse?
  21. What is your Daily feed size and weekly feed size?
  22. Which Approach (Top down or Bottom Up) was used in building your project?
  23. How do you access your sources (are they flat files or relational)?
  24. Have you developed any Stored Procedure or triggers in this project? How did you use them and in which situation?
  25. Did your Project go live? What are the issues that you have faced while moving your project from the Test Environment to the Production Environment?
  26. What is the biggest Challenge that you encountered in this project?
  27. What is the scheduler tool you have used in this project? How did you schedule jobs using it?