Data WareHousing Interview Questions & Answers: 2008

Friday, November 14, 2008

Informatica Interview Questions And Answers VI

What are the types of groups in Router transformation?
Input group Output group
The designer copies property information from the input ports of the input group to create a set of output ports for each output group.
Two types of output groups:-
User defined groups
Default group.
You can not modify or delete default groups.
What is the default join that source qualifier provides?
Inner equi join.
Define informatica repository?
Infromatica Repository:The informatica repository is at the center of the informatica suite. You create a set of metadata tables within the repository database that the informatica application and tools access. The informatica client and server access the repository to save and retrieve metadata.
Which tool U use to create and manage sessions and batches and to monitor and stop the informatica server?
Informatica Workflow Managar and Informatica Worlflow Monitor
How to define Informatica server?
Informatica server is the main server component in informatica product family..Which is resonsible for reads the data from various source system and tranforms the data according to business rule and loads the data into the target table
Explain the informatica Architecture in detail

informatica server connects source data and target data using native

odbc drivers

again it connect to the repository for running sessions and retriveing metadata information

source------>informatica server--------->target

|

|

REPOSITORY
I was working in SQL server, now i got an golden opertunity to work in INFORMATICA. I have lots of (silly) Questions to build my Career, so pls guide me properly. I will ask lots of questions....

What is the process flow of informatica,
Informatica is a ETL tool.used for the Extraction,Transformation and Loadind of data.This tool is used to Extract the data from different Data Bases and then we can do the required transfermation like data type conversions,doing some aggregations,ordering,filtering and so on.After that we can load the transformed data into our database,which will be used for the Bussiness Decissions.
Can anyone explain about incremental aggregation with an example?
Incremental aggregation is specially used for tune the performance of the aggregator. It captures the change each time (incrementally) you run the transformation and then performs the aggregation function to the changed rows and not to the entire rows. This improves the performance because you are not reading the entire source, each time you run the session.
How can you recognise whether or not the newly added rows in the source are gets insert in the target?
In the Type2 maping we have three options to recognise the newly added rows
Version number
Flagvalue
Effective date Range.
What is difference between dimention table and fact table
and what are different dimention tables and fact tables

In the fact table contain measurable data and less columns and meny rows,

It's contain primarykey

Diffrent types of fact tables:

additive,non additive, semi additive

In the dimensions table contain textual descrption of data and also contain meny columns,less rows

Its contain primary key
How to export mappings to the production environment?
In the designer go to the main menu and one can see the export/import options.

Import the exported mapping in to the production repository with replace options.
When do u use a unconnected lookup and connected lookup....

what is the difference between dynamic and static lookup...y and when do v use these types of lookups ( ie...dynamic and static )
In static lookup cache, you cache all the lookup data at the starting of the session. in dynamic lookup cache, you go and query the database to get the lookup value for each record which needs the lookup. static lookup cache adds to the session run time....but it saves time as informatica does not need to connect to your databse every time it needs to lookup. depending on how many rows in your mapping needs a lookup, you can decide on this...also remember that static lookup eats up space...so remember to select only those columns which are needed
With out using Updatestretagy and sessons options, how we can do the update our target table?
In session properties, There is an option

insert

update

insert as update

update as update

like that

by using this we will easily solve
In realtime which one is better star schema or snowflake star schema

the surrogate will be linked to which columns in the dimension table.
In real time only star schema will implement because it will take less time and surrogate key will there in each and every dimension table in star schema and this surrogate key will assign as foreign key in fact table.
What is the mapping for unit testing in Informatica, are there any other testings in Informatica, and how we will do them as a etl developer.
how do the testing people will do testing are there any specific tools for testing
In informatica there is no method for unit testing. There are two methods to test the mapping.

1. But we have data sampling. set the ata sampling properties for session in workflow manager for specified number of rows and test the mapping.

2. Use the debugger and test the mapping for sample records.

Informatica VII

What are the tasks that source qualifier performs?
Join data originating from same source data base.
Filter records when the informatica server reads source data.
Specify an outer join rather than the default inner join specify sorted records.
Select only distinct values from the source.
Creating custom query to issue a special SELECT statement for the informatica server to read source data.
How to join two tables without using the Joiner Transformation.
Itz possible to join the two or more tables by using source qualifier.But provided the tables should have relationship.
When u drag n drop the tables u will getting the source qualifier for each table.Delete all the source qualifiers.Add a common source qualifier for all.Right click on the source qualifier u will find EDIT click on it.Click on the properties tab,u will find sql query in that u can write ur sqls
If you want to create indexes after the load process which transformation you choose?
Its usually not done in the mapping(transformation) level. Its done in session level. Create a command task which will execute a shell script (if Unix) or any other scripts which contains the create index command. Use this command task in the workflow after the session or else, You can create it with a post session command.
What is the use of incremental aggregation? Explain me in brief with an example.
Its a session option. when the informatica server performs incremental aggr. it passes new source data through the mapping and uses historical chache data to perform new aggregation caluculations incrementaly. for performance we will use it.
can any one comment on
significance of oracle 9i in informatica when compared to oracle 8 or 8i.
I mean how is oracle 9i advantageous when compared to oracle 8 or 8i when used in informatica
it's very easy
Actually oracle 8i not allowed userdefined data types
but 9i allows
and then blob,clob allow only 9i not 8i
and more over list partinition is there in 9i only
What happens if you try to create a shortcut to a non-shared folder?
It only creates a copy of it..
What is metadata reporter?
It is a web based application that enables you to run reports againist repository metadata.
With a meta data reporter,You can access information about U’r repository with out having knowledge of sql,transformation language or underlying tables in the repository.
What is polling?
It displays the updated information about the session in the monitor window. The monitor window displays the status of each session when you poll the informatica server.
How do we estimate the number of partitons that a mapping really requires? Is it dependent on the machine configuration?
It depends upon the informatica version we r using. suppose if we r using informatica 6 it supports only 32 partitions where as informatica 7 supports 64 partitions.
How do you decide whether you need ti do aggregations at database level or at Informatica level?
It depends upon our requirment only.If you have good processing database you can create aggregation table or view at database level else its better to use informatica. Here i'm explaing why we need to use informatica.
what ever it may be informatica is a thrid party tool, so it will take more time to process aggregation compared to the database, but in Informatica an option we called "Incremental aggregation" which will help you to update the current values with current values +new values. No necessary to process entire values again and again. Unless this can be done if nobody deleted that cache files. If that happend total aggregation we need to execute on informatica also.

In database we don't have Incremental aggregation facility.
1.can u explain one critical mapping?
2.performance issue which one is better? whether connected lookup tranformation or unconnected one?
it depends on your data and the type of operation u r doing.
If u need to calculate a value for all the rows or for the maximum rows coming out of the source then go for a connected lookup.
Or,if it is not so then go for unconnectd lookup.
Specially in conditional case like,
we have to get value for a field 'customer' from order tabel or from customer_data table,on the basis of following rule:
If customer_name is null then ,customer=customer_data.ustomer_Id
otherwise
customer=order.customer_name.
so in this case we will go for unconnected lookup
Discuss which is better among incremental load, Normal Load and Bulk load
It depends on the requirement. Otherwise Incremental load which can be better as it takes onle that data which is not available previously on the target.
What is IQD file?
IQD file is nothing but Impromptu Query Definetion,This file is maily used in Cognos Impromptu tool after creating a imr( report) we save the imr as IQD file which is used while creating a cube in power play transformer.In data source type we select Impromptu Query Definetion.
Which objects are required by the debugger to create a valid debug session?
Intially the session should be valid session.
source, target, lookups, expressions should be availble, min 1 break point should be available for debugger to debug your session.
What are the options in the target session of update strategy transsformatioin?
Insert
Delete
Update
Update as update
Update as insert
Update esle insert
Truncate table

Informatica Interview Questions And Answers V

What r the tasks that Loadmanger process will do?
Manages the session and batch scheduling: Whe u start the informatica server the load maneger launches and queries the repository for a list of sessions configured to run on the informatica server.When u configure the session the loadmanager maintains list of list of sessions and session start times.When u sart a session loadmanger fetches the session information from the repository to perform the validations and verifications prior to starting DTM process.
Locking and reading the session: When the informatica server starts a session lodamaager locks the session from the repository.Locking prevents U starting the session again and again.
Reading the parameter file: If the session uses a parameter files,loadmanager reads the parameter file and verifies that the session level parematers are declared in the file
Verifies permission and privelleges: When the sesson starts load manger checks whether or not the user have privelleges to run the session.
Creating log files: Loadmanger creates logfile contains the status of session.
What are the different types of transformation available in informatica. and what are the mostly used ones among them
Mainly there are two types of tranformation.1]Active TransformationAn active transformation can change the number of rows that pass through it from source to target i.e it eliminates rows that do not meet the condition in transformation.2]Passive TransformationA passive transformation does not change the number of rows that pass through it i.e it passes all rows through the transformation.Transformations can be Connected or UnConnected. Connected TransformationConnected transformation is connected to other transformations or directly to target table in the mapping.UnConnected TransformationAn unconnected transformation is not connected to other transformations in the mapping. It is called within another transformation, and returns a value to that transformation.list of Transformations available in Informatica:1 source qualifier Tranformation2..Expression Transformation 3..Filter Transformation 4..Joiner Transformation 5..Lookup Transformation 6..Normalizer Transformation 7..Rank Transformation 8..Router Transformation 9..Sequence Generator Transformation 10..Stored Procedure Transformation 11..Sorter Transformation 12..Update Strategy Transformation .13...Aggregator Transformation 14..XML Source Qualifier Transformation 15..Advanced External Procedure Transformation 16..External Transformation 16.. custom tranformationMostly use of particular tranformation depend upon the requirement.In our project we are mostly using source qualifier ,aggregator,joiner,look up transformation
What are two types of processes that informatica runs the session?
Load manager Process: Starts the session, creates the DTM process, and sends post-session email when the session completes.
The DTM process. Creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.
How can we store previous session logs
ust run the session in time stamp mode then automatically session log will not overwrite current session log.
What is meant by Junk Attribute in Informatica?
Junk Dimension A Dimension is called junk dimension if it contains attribute which are rarely changed ormodified. example In Banking Domain , we can fetch four attributes accounting to a junk dimensions like from the Overall_Transaction_master table tput flag tcmp flag del flag advance flag all these attributes can be a part of a junk dimensions
In a joiner trasformation, you should specify the source with fewer rows as the master source. Why?
Joiner transformation compares each row of the master source against the detail source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
What r the tasks that source qualifier performs?
Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier.
Filter records when the Informatica Server reads source data. If you include a filter condition, the Informatica Server adds a WHERE clause to the default query.
Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica Server replaces the join information specified by the metadata in the SQL query.
Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds an ORDER BY clause to the default SQL query.
Select only distinct values from the source. If you choose Select Distinct, the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.
Create a custom query to issue a special SELECT statement for the Informatica Server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.

Informatica Interview Questions And Answers III

What is difference between partioning of relatonal target and partitioning of file targets?
Partition's can be done on both relational and flat files.
Informatica supports following partitions
1.Database partitioning
2.RoundRobin
3.Pass-through
4.Hash-Key partitioning
5.Key Range partitioning
All these are applicable for relational targets.For flat file only database partitioning is not applicable.
Informatica supports Nway partitioning.U can just specify the name of the target file and create the partitions, rest will be taken care by informatica session.
What are partition points?
Partition points mark the thread boundaries in a source pipeline and divide
the pipeline into stages.
What is parameter file?
Parameter file is to define the values for parameters and variables used in a session.A parameter
file is a file created by text editor such as word pad or notepad.
You can define the following values in parameter file:-
Maping parameters
Maping variables
session parameters.
Differences between Normalizer and Normalizer transformation.
Normalizer: It is a transormation mainly using for cobol sources,
it's change the rows into coloums and columns into rows
Normalization:To remove the retundancy and inconsitecy
Which transformation should we use to normalize the COBOL and relational sources?
Normalizer Transformation.
When you drag the COBOL source in to the mapping Designer workspace,the normalizer transformation automatically appears,creating input and output ports for every
column in the source.
Which transformation should u need while using the cobol sources as source defintions?
Normalizer transformaiton which is used to normalize the data.Since cobol sources r oftenly consists of Denormailzed data.
What is the difference between Narmal load and Bulk load?
Normal Load: Normal load will write information to the database log file so that if any recorvery is needed it is will be helpful. when the source file is a text file and loading data to a table,in such cases we should you normal load only, else the session will be failed.
Bulk Mode: Bulk load will not write information to the database log file so that if any recorvery is needed we can't do any thing in such cases.
compartivly Bulk load is pretty faster than normal load.
What are the join types in joiner transformation?
Normal (Default)
Master outer
Detail outer
Full outer.
What is the logic will you implement to laod the data in to one factv from 'n' number of dimension tables.
Noramally evey one use
!)slowly changing diemnsions
2)slowly growing dimensions
After draging the ports of three sources(sql server,oracle,informix) to a single source qualifier, can u map these three ports directly to target?
NO.Unless and until u join those three ports in source qualifier u cannot map them directly.
Can U use the maping parameters or variables created in one maping into another maping?
NO. You might want to use a workflow parameter/variable if you want it to be visible with other mappings/sessions
Can we lookup a table from a source qualifer transformation-unconnected lookup
No. we can't do.
I will explain you why.
1) Unless you assign the output of the source qualifier to another transformation or to target no way it will include the feild in the query.
2) source qualifier don't have any variables feilds to utalize as expression.
If i done any modifications for my table in back end does it reflect in informatca warehouse or maping desginer or source analyzer?
NO. Informatica is not at all concern with back end data base.It displays u all the information that is to be stored in repository.If want to reflect back end changes to informatica screens, again u have to import from back end to informatica by valid connection.And u have to replace the existing files with imported files.
Can you copy the batches?
No
Can you use the maping parameters or variables created in one maping into another maping?
No
Waht are main advantages and purpose of using Normalizer Transformation in Informatica?
Narmalizer Transformation is used mainly with COBOL sources where most of the time data is stored in de-normalized format. Also, Normalizer transformation can be used to create multiple rows from a single row of data
Can any one explain real time complain mappings or complex transformations in Informatica.
Specially in Sales Domain.
Most complex logic we use is denormalization. We dont have any Denormalizer transformation in INformatica. So we will have to use an aggregator followed by an expression. Apart from this, we use most of the complexicity in expression transformation involving lot of nested IIF's and Decode statements...another one is the union tranformation and joiner.
What is Micro Strategy? Why is it used for? Can any one explain in detail about it?
Micro strategy is again an BI tool whicl is a HOLAP... u can create 2 dimensional report and also cubes in here.......basically a reporting tool. IT HAS A FULL RANGE OF REPORTING ON WEB ALSO IN WINDOWS.
What are the real time problems generally come up while doing/running mapping/any transformation?can any body explain with example.
May be you will encounter with connection faliure, other then that i don't think so, cuzserver will handle all the syntex errors and Invalid mappings.

what are the difference between view and materialized view?
Materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data. E.g. to construct a data warehouse.
A materialized view provides indirect access to table data by storing the results of a query in a separate schema object. Unlike an ordinary view, which does not take up any storage space or contain any data
What are the different threads in DTM process?
Master thread: Creates and manages all other threads
Maping thread: One maping thread will be creates for each session.Fectchs session and maping information.
Pre and post session threads: This will be created to perform pre and post session operations.
Reader thread: One thread will be created for each partition of a source.It reads data from source.
Writer thread: It will be created to load data to the target.
Transformation thread: It will be created to tranform data.
what r the settiings that u use to cofigure the joiner transformation?
Master and detail source
Type of join
Condition of the join
the Joiner transformation supports the following join types, which you set in the Properties tab:
Normal (Default)
Master Outer
Detail Outer
Full Outer
What is difference between maplet and reusable transformation?
Maplet: one or more transformations
Reusable transformation: only one transformation
Define maping and sessions?
Maping: It is a set of source and target definitions linked by transformation objects that define the rules for transformation.
Session : It is a set of instructions that describe how and when to move data from source to targets.
What are the mapping paramaters and maping variables?
Maping parameter represents a constant value that You can define before running a session.A mapping parameter retains the same value throughout the entire session.
When you use the maping parameter ,U declare and use the parameter in a maping or maplet.Then define the value of parameter in a parameter file for the session.
Unlike a mapping parameter,a maping variable represents a value that can change throughout the session.The informatica server saves the value of maping variable to the
repository at the end of session run and uses that value next time you run the session.
What are the tasks that Loadmanger process will do?
Manages the session and batch scheduling: Whe you start the informatica server the load maneger launches and queries the repository for a list of sessions configured to run
on the informatica server.When you configure the session the loadmanager maintains list of list of sessions and session start times.When you sart a session loadmanger fetches the session information from the repository to perform the validations and verifications prior to starting DTM process.
Locking and reading the session: When the informatica server starts a session lodamaager locks the session from the repository.Locking prevents you starting the session again and again.
Reading the parameter file: If the session uses a parameter files,loadmanager reads the parameter file and verifies that the session level parematers are declared in the file
Verifies permission and privelleges: When the sesson starts load manger checks whether or not the user have privelleges to run the session.
Creating log files: Loadmanger creates logfile contains the status of session.

Informatica Interview Questions And Answers IV

What is the difference between stop and abort
stop: _______If the session u want to stop is a part of batch you must stop the batch,
if the batch is part of nested batch, Stop the outer most bacth\
Abort:----
You can issue the abort command , it is similar to stop command except it has 60 second time out .
If the server cannot finish processing and commiting data with in 60 sec
What is the status code?
Status code provides error handling for the informatica server during the session.The stored procedure issues a status code that notifies whether or not stored procedure
completed sucessfully.This value can not seen by the user.It only used by the informatica server to determine whether to continue running the session or stop.
Difference between static cache and dynamic cache?
Static cache:
You can not insert or update the cache.
Dynamic cache:
You can insert rows into the cache as you pass to the target.
Difference between static cache and dynamic cache
Static cache
Dynamic cache
U can not insert or update the cache
U can insert rows into the cache as u pass to the target
The informatic server returns a value from the lookup table or cache when the condition is true.When the condition is not true, informatica server returns the default value for connected transformations and null for unconnected transformations.
The informatic server inserts rows into cache when the condition is false.This indicates that the the row is not in the cache or target table. U can pass these rows to the target table
What is power center repository?
Standalone repository. A repository that functions individually, unrelated and unconnected to other repositories.
Global repository. (PowerCenter only.) The centralized repository in a domain, a group of connected repositories. Each domain can contain one global repository. The global repository can contain common objects to be shared throughout the domain through global shortcuts.
Local repository. (PowerCenter only.) A repository within a domain that is not the global repository. Each local repository in the domain can connect to the global repository and use objects in its shared folders.
What r the joiner caches?
Specifies the directory used to cache master records and the index to these records. By default, the cached files are created in a directory specified by the server variable $PMCacheDir. If you override the directory, make sure the directory exists and contains enough disk space for the cache files. The directory can be a mapped or mounted drive.
In the source, if we also have duplicate records and we have 2 targets, T1- for unique values and T2- only for duplicate values. How do we pass the unique values to T1 and duplicate values to T2 from the source to these 2 different targets in a single mapping?
source--->sq--->exp-->sorter(with enable select distinct check box)--->t1
--->aggregator(with enabling group by and write count
function)--->t2
If u want only duplicates to t2 u can follow this sequence
--->agg(with enable group by write this code decode(count(col),1,1,0))--->Filter(condition is 0)--->t2.
What r the diffrence between joiner transformation and source qualifier transformation?
Source qualifier – Homogeneous source
Joiner – Heterogeneous source
While importing the relational source defintion from database, what are the meta data of source you import?
Source name
Database location
Column names
Datatypes
Key constraints.
What r the unsupported repository objects for a mapplet?
Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.
Target definitions. Definitions of database objects or files that contain the target data.
Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.
Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
Reusable transformations. Transformations that you can use in multiple mappings.
Mapplets. A set of transformations that you can use in multiple mappings.
Sessions and workflows. Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.
What r the types of metadata that stores in repository?
Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.
Target definitions. Definitions of database objects or files that contain the target data.
Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.
Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
Reusable transformations. Transformations that you can use in multiple mappings.
Mapplets. A set of transformations that you can use in multiple mappings.
Sessions and workflows. Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.
Suppose session is configured with commit interval of 10,000 rows and source has 50,000 rows. Explain the commit points for Source based commit and Target based commit. Assume appropriate value wherever required.
Source based commit will commit the data into target based on commit interval.so,for every 10,000 rows it will commit into target.
Target based commit will commit the data into target based on buffer size of the target.i.e., it commits the data into target when ever the buffer fills.Let us assume that the buffer size is 6,000.So,for every 6,000 rows it commits the data.
What are the reusable transforamtions?
Reusable transformations can be used in multiple mappings.When you need to incorporate this transformation into maping,U add an instance of it to maping.Later if you change the definition of the transformation ,all instances of it inherit the changes.Since the instance of reusable transforamation is a pointer to that transforamtion,You can change the transforamation in the transformation developer,its instances automatically reflect these changes.This feature can save you great deal of work.
What are the types of maping in Getting Started Wizard?
Simple Pass through maping :
Loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.
Slowly Growing target :
Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when existing data does not require updates.
What r the types of maping wizards that r to be provided in Informatica?
Simple Pass through
Slowly Growing Target
Slowly Changing the Dimension
Type1
Most recent values
Type2
Full History
Version
Flag
Date
Type3
Current and one previous
What are Dimensions and various types of Dimensions?
Set of level properties that describe a specific aspect of a business, used for analyzing the factual measures of one or more cubes, which use that dimension. Egs. Geography, time, customer and product.
What are the session parameters?
Session parameters are like maping parameters,represent values you might want to change between sessions such as database connections or source files.
Server manager also allows you to create userdefined session parameters.Following are user defined session parameters:-
Database connections
Source file names: use this parameter when you want to change the name or location of
session source file between session runs.
Target file name : Use this parameter when you want to change the name or location of
session target file between session runs.
Reject file name : Use this parameter when you want to change the name or location of
session reject files between session runs.
What is Session and Batches?
Session - A Session Is A set of instructions that tells the Informatica Server How And When To Move Data From Sources To Targets. After creating the session, we can use
either the server manager or the command line program pmcmd to start or stop the session.
Batches - It Provides A Way to Group Sessions For Either Serial Or Parallel
Execution By The Informatica Server.
There Are Two Types Of Batches :
Sequential - Run Session One after the Other.
Concurrent - Run Session At The Same Time.
If a session fails after loading of 10,000 records in to the target.How can u load the records from 10001 th record when u run the session next time in informatica 6.1?
Running the session in recovery mode will work, but the target load type should be normal. If its bulk then recovery wont work as expected
Whats the diff between Informatica powercenter server, repositoryserver and repository?
Repository is a database in which all informatica componets are stored in the form of tables. The reposiitory server controls the repository and maintains the data integrity and Consistency across the repository when multiple users use Informatica. Powercenter Server/Infa Server is responsible for execution of the components (sessions) stored in the repository.
How can you access the remote source into your session?
Relational source: To acess relational source which is situated in a remote place ,u need to configure database connection to the datasource.
FileSource : To access the remote source file you must configure the FTP connection to the host machine before you create the session.
Hetrogenous : When U’r maping contains more than one source type,the server manager creates a hetrogenous session that displays source options for all types.
Difference between Rank and Dense Rank?
Rank:
2<--2nd position 2<--3rd position 4 5 Same Rank is assigned to same totals/numbers. Rank is followed by the Position. Golf game ususally Ranks this way. This is usually a Gold Ranking. Dense Rank: 1 2<--2nd position 2<--3rd position 3 4 Same ranks are assigned to same totals/numbers/names. the next rank follows the serial number. What is rank transformation?where can we use this transformation? Rank transformation is used to find the status.ex if we have one sales table and in this if we find more employees selling the same product and we are in need to find the first 5 0r 10 employee who is selling more products.we can go for rank transformation. In update strategy target table or flat file which gives more performance ? why? Pros: Loading, Sorting, Merging operations will be faster as there is no index concept and Data will be in ASCII mode. Cons: There is no concept of updating existing records in flat file. As there is no indexes, while lookups speed will be lesser. What is a command that used to run a batch? pmcmd is used to start a batch. What r the mapping paramaters and maping variables? Please refer to the documentation for more understanding. Mapping variables have two identities: Start value and Current value Start value = Current value ( when the session starts the execution of the undelying mapping) Start value <> Current value ( while the session is in progress and the variable value changes in one ore more occasions)
Current value at the end of the session is nothing but the start value for the subsequent run of the same session.
How do we estimate the depth of the session scheduling queue? Where do we set the number of maximum concurrent sessions that Informatica can run at a given time?
please be more specific on the first half of the question.
u set the max no of concurrent sessions in the info server.by default its 10. u can set to any no.
Where should you place the flat file to import the flat file defintion to the designer?
Place it in local folder.
Why we use partitioning the session in informatica?
Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline.
Informatica server can achieve high performance by partitioning the pipleline and performing the extract , transformation, and load for each partition in parallel.
Why we use partitioning the session in informatica?
Partitioning achieves the session performance by reducing the time period of reading the source and loading the data into target.

Informatica Interview Questions And Answers II

What are the types of maping wizards that are to be provided in Informatica?
The Designer provides two mapping wizards to help you create mappings quickly and easily. Both wizards are designed to create mappings for loading and maintaining star
schemas, a series of dimensions related to a central fact table.
Getting Started Wizard. Creates mappings to load static fact and dimension tables, as well as slowly growing dimension tables.
Slowly Changing Dimensions Wizard. Creates mappings to load slowly changing dimension tables based on the amount of historical dimension data you want to keep and the method you choose to handle historical dimension data.
What is the Rankindex in Ranktransformation?
The Designer automatically creates a RANKINDEX port for each Rank transformation. The Informatica Server uses the Rank Index port to store the ranking position for
each record in a group. For example, if you create a Rank transformation that ranks the top 5 salespersons for each quarter, the rank index numbers the salespeople from 1
to 5.
In a filter expression we want to compare one date field with a db2 system field CURRENT DATE.
Our Syntax: datefield = CURRENT DATE (we didn't define it by ports, its a system field ), but this is not valid (PMParser: Missing Operator)..
Can someone help us. Thanks
The db2 date formate is "yyyymmdd" where as sysdate in oracle will give "dd-mm-yy" so conversion of db2 date formate to local database date formate is compulsary. other wise u will get that type of error
What are two modes of data movement in Informatica Server?
The data movement mode depends on whether Informatica Server should process single byte or multi-byte character data. This mode selection can affect the enforcement
of code page relationships and code page validation in the Informatica Client and Server.
a) Unicode - IS allows 2 bytes for each character and uses additional byte for each non-ascii character (such as Japanese characters)
b) ASCII - IS holds all data in a single byte.
The IS data movement mode can be changed in the Informatica Server configuration parameters. This comes into effect once you restart the Informatica Server.
Identifying bottlenecks in various components of Informatica and resolving them.
The best way to find out bottlenecks is writing to flat file and see where the bottle neck is
What r the basic needs to join two sources in a source qualifier?
The both the table should have a common feild with same datatype.
Its not neccessary both should follow primary and foreign relationship. If any relation ship exists that will help u in performance point of view.
Identifying bottlenecks in various components of Informatica and resolving them.
The best way to find out bottlenecks is writing to flat file and see where the bottle neck is
What is aggregate cache in aggregator transforamtion?
The aggregator stores data in the aggregate cache until it completes aggregate calculations.When you run a session that uses an aggregator transformation,the informatica server creates index and data caches in memory to process the transformation.If the informatica server requires more space,it stores overflow values in cache files.
Can u tell me how to go for SCD's and its types.Where do we use them mostly
The "Slowly Changing Dimension" problem is a common one particular to data warehousing. In a nutshell, this applies to cases where the attribute for a record varies over time. We give an example below: Christina is a customer with ABC Inc. She first lived in Chicago, Illinois. So, the original entry in the customer lookup table has the following record: Customer Key Name State 1001 Christina IllinoisAt a later date, she moved to Los Angeles, California on January, 2003. How should ABC Inc. now modify its customer table to reflect this change? This is the "Slowly Changing Dimension" problem. There are in general three ways to solve this type of problem, and they are categorized as follows: In Type 1 Slowly Changing Dimension, the new information simply overwrites the original information. In other words, no history is kept. In our example, recall we originally have the following table: Customer Key Name State 1001 Christina IllinoisAfter Christina moved from Illinois to California, the new information replaces the new record, and we have the following table: Customer Key Name State 1001 Christina CaliforniaAdvantages: - This is the easiest way to handle the Slowly Changing Dimension problem, since there is no need to keep track of the old information. Disadvantages: - All history is lost. By applying this methodology, it is not possible to trace back in history. For example, in this case, the company would not be able to know that Christina lived in Illinois before. Usage: About 50% of the time. When to use Type 1: Type 1 slowly changing dimension should be used when it is not necessary for the data warehouse to keep track of historical changes. In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key. In our example, recall we originally have the following table: Customer Key Name State 1001 Christina IllinoisAfter Christina moved from Illinois to California, we add the new information as a new row into the table: Customer Key Name State 1001 Christina Illinois 1005 Christina CaliforniaAdvantages: - This allows us to accurately keep all historical information. Disadvantages: - This will cause the size of the table to grow fast. In cases where the number of rows for the table is very high to start with, storage and performance can become a concern. - This necessarily complicates the ETL process. Usage: About 50% of the time. When to use Type 2: Type 2 slowly changing dimension should be used when it is necessary for the data warehouse to track historical changes. In Type 3 Slowly Changing Dimension, there will be two columns to indicate the particular attribute of interest, one indicating the original value, and one indicating the current value. There will also be a column that indicates when the current value becomes active. In our example, recall we originally have the following table: Customer Key Name State1001 Christina IllinoisTo accomodate Type 3 Slowly Changing Dimension, we will now have the following columns: • Customer Key • Name • Original State • Current State • Effective Date After Christina moved from Illinois to California, the original information gets updated, and we have the following table (assuming the effective date of change is January 15, 2003): Customer Key Name Original State Current State Effective Date 1001 Christina Illinois California 15-JAN-2003Advantages: - This does not increase the size of the table, since new information is updated. - This allows us to keep some part of history. Disadvantages: - Type 3 will not be able to keep all history where an attribute is changed more than once. For example, if Christina later moves to Texas on December 15, 2003, the California information will be lost. Usage: Type 3 is rarely used in actual practice. When to use Type 3: Type III slowly changing dimension should only be used when it is necessary for the data warehouse to track historical changes, and when such changes will only occur for a finite number of time.
What are Target Types on the Server?
Target Types are File, Relational and ERP.
What are Target Options on the Servers?
Target Options for File Target type are FTP File, Loader and MQ.
There are no target options for ERP target type.
Target Options for Relational are Insert, Update (as Update), Update (as Insert), Update (else Insert), Delete, and Truncate Table.
What is the difference between summary filter and detail filter]
Summary filter can be applieid on a group of rows that contain a common value.where as detail filters can be applied on each and every rec of the data base.
Difference between summary filter and details filter?
Summary Filter --- we can apply records group by that contain common values.
Detail Filter --- we can apply to each and every record in a database.
What is the diff b/w STOP & ABORT in INFORMATICA sess level ?
Stop:We can Restart the session
Abort:WE cant restart the session.We should truncate all the pipeline after that start the session

Informatica Interview Questions And Answers I

How to use the unconnected lookup i.e., from where the input has to be taken and the output is linked?
What condition is to be given?
The unconnected lookup is used just like a function call. in an expression output/variable port or any place where an expression is accepted(like condition in update strategy etc..), call the unconnected lookup...something like :LKP.lkp_abc(input_port).......(lkp_abc is the name of the unconnected lookup...(plz check the exact syntax)).....give the input value just like we pass parameters to functions, and it'll return the output after looking up.

What r the circumstances that infromatica server results an unreciverable session?
The source qualifier transformation does not use sorted ports.
If u change the partition information after the initial session fails.
Perform recovery is disabled in the informatica server configuration.
If the sources or targets changes after initial session fails.
If the maping consists of sequence generator or normalizer transformation.
If a concuurent batche contains multiple failed sessions.

What is power center repository?
The PowerCenter repository allows you to share metadata across repositories to create a data mart domain. In a data mart domain, yoYou can create a single global
repository to store metadata used across an enterprise, and a number of local repositories to share the global metadata as needed.
Two relational tables are connected to SQ Trans,what are the possible errors it will be thrown?
The only two possibilities as of I know is
Both the table should have primary key/foreign key relation ship
Both the table should be available in the same schema or same database

Which transformation should we use to normalize the COBOL and relational sources?
The Normalizer transformation normalizes records from COBOL and relational sources, allowing you to organize the data according to your own needs. A Normalizer transformation can appear anywhere in a data flow when you normalize a relational source. Use a Normalizer transformation instead of the Source Qualifier transformation when you normalize a COBOL source. When you drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation automatically appears, creating input and output ports for every column in the source

What are the Differences between Informatica Power Center versions 6.2 and 7.1, also between Versions 6.2 and 5.1?
The main difference between informatica 5.1 and 6.1 is that in 6.1 they introduce a new thing called repository server and in place of server manager(5.1), they introduce workflow manager and workflow monitor.

What is Load Manager?
The load Manager is the Primary Informatica Server Process. It performs the following tasks:-
Manages session and batch scheduling.
Locks the session and read session properties.
Reads the parameter file.
Expand the server and session variables and parameters.
Verify permissions and privileges.
Validate source and target code pages.
Create the session log file.
Create the Data Transformation Manager which execute the session.

What is the default join that source qualifier provides?
The Joiner transformation supports the following join types, which you set in the Properties tab:
Normal (Default)
Master Outer
Detail Outer
Full Outer

How can we partition a session in Informatica?
he Informatica® PowerCenter® Partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning.
GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks, while ensuring data integrity throughout the execution process. As the amount of data within an organization expands and real-time demand for information grows, the PowerCenter Partitioning option
enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.

What is Datadriven?
The Informatica Server follows instructions coded into Update Strategy transformations within the session mapping to determine how to flag rows for insert, delete, update, or reject.
If the mapping for the session contains an Update Strategy transformation, this field is marked Data Driven by default.

What is meant by lookup caches?
The informatica server builds a cache in memory when it processes the first row af a data in a cached look up transformation.It allocates memory for the cache based on the
amount you configure in the transformation or session properties.The informatica server stores condition values in the index cache and output values in the data cache.

Define Informatica Repository?
The Informatica repository is a relational database that stores information, or metadata, used by the Informatica Server and Client tools. Metadata can include information
such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings
for sources and targets. The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version. Use repository manager to create the repository.The Repository Manager connects to the repository database and runs the code needed to create the repository tables.Thsea tables stores metadata in specific format the informatica server,client tools use.

Performance tuning in Informatica?
The goal of performance tuning is optimize session performance so sessions run during the available load window for the Informatica Server.Increase the session
performance by following.

The performance of the Informatica Server is related to network connections. Data generally moves across a network at less than 1 MB per second, whereas a local disk
moves data five to twenty times faster. Thus network connections ofteny affect on session performance.So aviod netwrok connections. Flat files: If u’r flat files stored on a machine other than the informatca server, move those files to the machine that consists of informatica server.
Relational datasources: Minimize the connections to sources ,targets and informatica server to improve session performance.Moving target database into server system may improve session performance. Staging areas: If u use staging areas u force informatica server to perform multiple datapasses. Removing of staging areas may improve session performance.
You can run the multiple informatica servers againist the same repository.Distibuting the session load to multiple informatica servers may improve session performance.
Run the informatica server in ASCII datamovement mode improves the session performance.Because ASCII datamovement mode stores a character value in one
byte.Unicode mode takes 2 bytes to store a character.
If a session joins multiple source tables in one Source Qualifier, optimizing the query may improve performance. Also, single table select statements with an ORDER BY or
GROUP BY clause may benefit from optimization such as adding indexes.
We can improve the session performance by configuring the network packet size,which allows data to cross the network at one time.To do this go to server manger ,choose server configure database connections. If u are target consists key constraints and indexes u slow the loading of data.To improve the session performance in this case drop constraints and indexes before u run the session and rebuild them after completion of session. Running a parallel sessions by using concurrent batches will also reduce the time of loading the data.So concurent batches may also increase the session performance.
Partittionig the session improves the session performance by creating multiple connections to sources and targets and loads data in paralel pipe lines. In some cases if a session contains a aggregator transformation ,You can use incremental aggregation to improve session performance. Aviod transformation errors to improve the session performance. If the sessioin containd lookup transformation You can improve the session performance by enabling the look up cache. If U’r session contains filter transformation ,create that filter transformation nearer to the sources or You can use filter condition in source qualifier. Aggreagator,Rank and joiner transformation may oftenly decrease the session performance .Because they must group data before processing it.To improve session performance in this case use sorted ports option.

What is the procedure to write the query to list the highest salary of three employees?
The following is the query to find out the top three salaries
in ORACLE:--(take emp table)
select * from emp e where 3>(select count (*) from emp where
e.sal>emp.sal) order by sal desc.
in SQL Server:-(take emp table)
select top 10 sal from emp

Can i run the mapping with out starting the informatica server?
The difference between cache and uncacheed lookup iswhen you configure the lookup transformation cache lookup it stores all the lookup table data in the cache when the first input record enter into the lookup transformation, in cache lookup the select statement executes only once and compares the values of the input record with the values in the cachebut in uncache lookup the the select statement executes for each input record entering into the lookup transformation and it has to connect to database each time entering the new record

Questions VIII

How many ways you create ports?
Two ways:-
1.Drag the port from another transforamtion
2.Click the add buttion on the ports tab.

How many ways you can update a relational source defintion and what are they?
Two ways:-
1. Edit the definition
2. Reimport the defintion.
What are the basic needs to join two sources in a source qualifier?
Two sources should have primary and Foreign key relation ships.
Two sources should have matching data types.

What are the different options used to configure the sequential batches?
Two options
Run the session only if previous session completes sucessfully. Always runs the session.

What are the methods for creating reusable transforamtions?
Two methods:-
1.Design it in the transformation developer.
2.Promote a standard transformation from the mapping designer.After you add a transformation to the mapping , You can promote it to the status of reusable transformation.
Once you promote a standard transformation to reusable status,You can demote it to a standard transformation at any time.
If you change the properties of a reusable transformation in mapping,You can revert it to the original reusable transformation properties by clicking the revert button.

What r the active and passive transforamtions?
Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition.
A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.

What is tracing level and what r the types of tracing level?
Tracing level represents the amount of information that informatcia server writes in a log file.
Types of tracing level:-
Normal
Verbose
Verbose init
Verbose data
Compare Data Warehousing Top-Down approach with Bottom-up approach
Top down
ODS-->ETL-->Datawarehouse-->Datamart-->OLAP
Bottom up
ODS-->ETL-->Datamart-->Datawarehouse-->OLAP

How can u work with remote database in informatica?did u work directly by using remote connections?
To work with remote datasource u need to connect it with remote connections.But it is not preferable to work with that remote source directly by using remote connections .Instead u bring that source into U r local machine where informatica server resides.If u
work directly with remote source the session performance will decreases by passing less amount of data across the network in a particular time.

Why use the lookup transformation?
To perform the following tasks:-
Get a related value. For example, if your source table includes employee ID, but you want to include the employee name in your target table to make your summary data
easier to read.
Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net
sales).
Update slowly changing dimension tables. You can use a Lookup transformation to determine whether records already exist in the target.
In certain mapping there are four targets tg1,tg2,tg3 and tg4.
tg1 has a primary key,tg2 foreign key referencing the tg1's primary key,tg3 has primary key that tg2 and tg4 refers as foreign key,tg2 has foreign key referencing primary key of tg4 ,the order in which the informatica will load the target?

2]How can I detect aggregate tranformation causing low performance?
To optimize the aggregator transformation, you can use the following options.
Use incremental aggregation
Sort the ports before you perform aggregation
Avoid using aggregator transformation after update strategy, since it might be confusing.
Explain use of update strategy transformation
To flag source records as INSERT, DELETE, UPDATE or REJECT for target database. Default flag is Insert. This is must for Incremental Data Loading.

How do you load the time dimension.
ime Dimension will generally load manually by using PL/SQL , shell scripts, proc C etc......
What are the types of data that passes between informatica server and stored procedure?
Three types of data:-
Input/Out put parameters
Return Values
Status code.

What is update strategy transformation?
This transformation is used to maintain the history data or just most recent changes in to target table.

Why sorter transformation is an active transformation?
This is type of active transformation which is responsible for sorting the data either in the ascending order or descending order according to the key specifier. the port on which the sorting takes place is called as sortkeyport properties if u select distinct eliminate duplicates
case sensitive valid for strings to sort the data null treated low null values are given least priority

How to append the records in flat file(Informatica) ? Where as in Datastage we have the options
i) overwrite the existing file
ii) Append existing file
This is not there in Informatica v 7. but heard that its included in the latest version 8.0 where u can append to a flat file. Its about to be shipping in the market.
In which condtions we can not use joiner transformation(Limitaions of joiner transformation)?
This is no longer valid in version 7.2
Now we can use a joiner even if the data is coming from the same source.
If you have four lookup tables in the workflow. How do you troubleshoot to improve performance?
There r many ways to improve the mapping which has multiple lookups.
1) we can create an index for the lookup table if we have permissions(staging area).
2) divide the lookup mapping into two (a) dedicate one for insert means: source - target,, these r new rows . only the new rows will come to mapping and the process will be fast . (b) dedicate the second one to update : source=target,, these r existing rows. only the rows which exists allready will come into the mapping.
3)we can increase the chache size of the lookup.
How many types of dimensions are available in informatica?
There r 3 types of dimensions
1.star schema
2.snowflake schema
3.glaxy schema

Where should U place the flat file to import the flat file defintion to the designer?
There is no such restrication to place the source file. In performance point of view its better to place the file in server local src folder. if you need path please check the server properties availble at workflow manager.
It doesn't mean we should not place in any other folder, if we place in server src folder by default src will be selected at time session creation.

What is difference b/w Informatica 7.1 and Abinitio
There is a lot of diffrence between informatica an Ab Initio
In Ab Initio we r using 3 parllalisim
but Informatica using 1 parllalisim
In Ab Initio no scheduling option we can scheduled manully or pl/sql script
but informatica contains 4 scheduling options
Ab Inition contains co-operating system
but informatica is not
Ramp time is very quickly in Ab Initio campare than Informatica
Ab Initio is userfriendly than Informatica

How many types of facts and what are they?
There are
Factless Facts:Facts without any measures.
Additive Facts:Fact data that can be additive/aggregative.
Non-Additive facts: Facts that are result of non-additon
Semi-Additive Facts: Only few colums data can be added.
Periodic Facts: That stores only one row per transaction that happend over a period of time.
Accumulating Fact: stores row for entire lifetime of event.

Questions VII

why did u use update stategy in your application?
Update Strategy is used to drive the data to be Inert, Update and Delete depending upon some condition. You can do this on session level tooo but there you cannot define any condition.For eg: If you want to do update and insert in one mapping...you will create two flows and will make one as insert and one as update depending upon some condition.Refer : Update Strategy in Transformation Guide for more information

What r the options in the target session of update strategy transsformatioin?
Update as Insert:
This option specified all the update records from source to be flagged as inserts in the target. In other words, instead of updating the records in the target they are inserted as new records.
Update else Insert:
This option enables informatica to flag the records either for update if they are old or insert, if they are new records from source.

How do we do unit testing in informatica?
How do we load data in informatica ?
Unit testing are of two types
1. Quantitaive testing
2.Qualitative testing
Steps.
1.First validate the mapping
2.Create session on themapping and then run workflow.
Once the session is succeeded the right click on session and go for statistics tab.
There you can see how many number of source rows are applied and how many number of rows loaded in to targets and how many number of rows rejected.This is called Quantitative testing.
If once rows are successfully loaded then we will go for qualitative testing.
Steps
1.Take the DATM(DATM means where all business rules are mentioned to the corresponding source columns) and check whether the data is loaded according to the DATM in to target table.If any data is not loaded according to the DATM then go and check in the code and rectify it.
This is called Qualitative testing.
This is what a devloper will do in Unit Testing.

How can u complete unrcoverable sessions?
Under certain circumstances, when a session does not complete, you need to truncate the target tables and run the session from the beginning. Run the session from the
beginning when the Informatica Server cannot run recovery or when running recovery might result in inconsistent data.

What is the difference between connected and unconnected stored procedures.
Unconnected:
The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.
connected:
The flow of data through a mapping in connected mode also passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation.
Can we eliminate duplicate rows by using filter and router transformation ?if so explain me in detail .
U can use SQL query for uniqness if the source is Relational
But if the source is Flat file then u should use Shorter or Aggregatot transformation

Can u start a batches with in a batch?
U can not. If u want to start batch that resides in a batch,create a new independent batch and copy the necessary sessions into the new batch.
in the concept of mapping parameters and variables, the variable value will be saved to the repository after the completion of the session and the next time when u run the session, the server takes the saved variable value in the repository and starts assigning the next value of the saved value. for example i ran a session and in the end it stored a value of 50 to the repository.next time when i run the session, it should start with the value of 70. not with the value of 51.
how to do this.
u can do onething after running the mapping,, in workflow manager
start-------->session.
right clickon the session u will get a menu, in that go for persistant values, there u will find the last value stored in the repository regarding to mapping variable. then remove it and put ur desired one, run the session... i hope ur task will be done

How can U create or import flat file definition in to the warehouse designer?
U can create flat file definition in warehouse designer.in the warehouse designer,u can create new target: select the type as flat file. save it and u can enter various columns for that created target by editing its properties.Once the target is created, save it. u can import it from the mapping designer.

What r the different types of Type2 dimension maping?
Type2
1. Version number
2. Flag
3.Date

What are the mapings that we use for slowly changing dimension table?
Type1: Rows containing changes to existing dimensions are updated in the target by overwriting the existing dimension. In the Type 1 Dimension mapping, all rows contain
current dimension data.
Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.
Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary
key and creating a version number for each dimension in the table.
Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table. Version numbers and versioned primary keys track the order of changes to each dimension.
Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions to the target. Rows
containing changes to existing dimensions are updated in the target. When updating an existing dimension, the Informatica Server saves existing data in different columns
of the same row and replaces the existing data with the updates.

Questions VI

In which circumstances that informatica server creates Reject files?
When it encounters the DD_Reject in update strategy transformation.
Violates database constraint
Filed in the rows was truncated or overflowed.

How the informatica server sorts the string values in Ranktransformation?
When Informatica Server runs in UNICODE data movement mode ,then it uses the sort order configured in session properties.

What are the joiner caches?
When a Joiner transformation occurs in a session, the Informatica Server reads all the records from the master source and builds index and data caches based on the
master rows.
After building the caches, the Joiner transformation reads records from the detail source and perform joins.
When do u we use dynamic cache and when do we use static cache in an connected and unconnected lookup transformation We use dynamic cache only for connected lookup. We use dynamic cache to check whether the record already exists in the target table are not. And depending on that, we insert,update or delete the records using update strategy. Static cache is the default cache in both connected and unconnected. If u select static cache on lookup table in infa, it own't update the cache and the row in the cache remain constant. We use this to check the results and also to update slowly changing records

What are variable ports and list two situations when they can be used?
We have mainly tree ports Inport, Outport, Variable port. Inport represents data is flowing into transformation. Outport is used when data is mapped to next transformation. Variable port is used when we mathematical caluculations are required. If any addition i will be more than happy if you can share.

How to load time dimension?
We can use SCD Type 1/2/3 to load any Dimensions based on the requirement.
where do we use MQ series source qualifier, application multi group source qualifier. just give an example for a better understanding
We can use a MQSeries SQ when we have a MQ messaging system as source(queue).
When there is need to extract data from a Queue, which will basically have messages in XML format, we will use a JMS or a MQ SQ depending on the messaging system. If you have a TIBCO EMS Queue, use a JMS source and JMS SQ and an XML Parser, or if you have a MQ series queue, then use a MQ SQ which will be associated with a Flat file or a Cobal file.

In a sequential Batch how can we stop single session?
We can stop it using PMCMD command or in the monitor right click on that perticular session and select stop.this will stop the current session and the sessions next to it.

Can you start a session inside a batch idividually?
We can start our required session only in case of sequential batch.in case of concurrent batch we cant do like this.

What is a view? How it is related to data independence?And what are the different types of views,and what is Materialize view views view is a combination of one or more table.view does not stores the data,it just store the query in file format.If we excutes the query the query will fetch the data from the tables and just make it to view for us. Types views materilized view

What are various types of Aggregation?
Various types of aggregation are SUM, AVG, COUNT, MAX, MIN, FIRST, LAST, MEDIAN, PERCENTILE, STDDEV, and VARIANCE.

What is mystery dimention?
using Mystery Dimension ur maitaining the mystery data in ur Project.

what is the look up transformation?
Using it we can access the data from a relational table which is not a source in the mapping.
For Ex:Suppose the source contains only Empno, but we want Empname also in the mapping.Then instead of adding another tbl which contains Empname as a source ,we can Lkp the table and get the Empname in target.

How do you create a mapping using multiple lookup transformation?
Use unconnected lookup if same lookup repeats multiple times.

How can we eliminate duplicate rows from flat file?
Use Sorter Transformation. When you configure the Sorter Transformation to treat output rows as distinct, it configures all ports as part of the sort key. It therefore discards duplicate rows compared during the sort operation

How can you improve session performance in aggregator transformation?
Use sorted input.

What is the look up transformation?
Use lookup transformation in u’r mapping to lookup data in a relational table,view,synonym.
Informatica server queries the look up table based on the lookup ports in the transformation.It compares the lookup transformation port values to lookup table column values based on the look up condition.

How to get two targets T1 containing distinct values and T2 containing duplicate values from one source S1.
Use filter transformation for loading the target with no duplicates. and for the other transformation load it directly from source.

How to delete duplicate rows in flat files source is any option in informatica
Use a sorter transformation , in that u will have a "distinct" option make use of it .
When we create a target as flat file and source as oracle.. how can i specify first rows as column names in flat files...
use a pre sql statement....but this is a hardcoding method...if you change the column names or put in extra columns in the flat file, you will have to change the insert statement

Why did you use stored procedure in your ETL Application?
usage of stored procedure has the following advantages
1checks the status of the target database
2drops and recreates indexes
3determines if enough space exists in the database
4performs aspecilized calculation

Questions V

Why you use repository connectivity?
When you edit,schedule the sesion each time,informatica server directly communicates the repository to check whether or not the session and users are valid.All the metadata of sessions and mappings will be stored in repository.
Briefly explian the Versioning Concept in Power Center 7.1.
When you create a version of a folder referenced by shortcuts, all shortcuts continue to reference their original object in the original version. They do not automatically update to the current folder version.
For example, if you have a shortcut to a source definition in the Marketing folder, version 1.0.0, then you create a new folder version, 1.5.0, the shortcut continues to point to the source definition in version 1.0.0.
Maintaining versions of shared folders can result in shortcuts pointing to different versions of the folder. Though shortcuts to different versions do not affect the server, they might prove more difficult to maintain. To avoid this, you can recreate shortcuts pointing to earlier versions, but this solution is not practical for much-used objects. Therefore, when possible, do not version folders referenced by shortcuts.

What is source qualifier transformation?
When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.
Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier.
Filter records when the Informatica Server reads source data. If you include a filter condition, the Informatica Server adds a WHERE clause to the default query.
Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica Server replaces the join information specified by the metadata in the SQL query.
Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds an ORDER BY clause to the default SQL query.
Select only distinct values from the source. If you choose Select Distinct, the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.
Create a custom query to issue a special SELECT statement for the Informatica Server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.

What is a source qualifier?-
When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.

What is source qualifier transformation?
When you add a relational or a flat file source definition to a maping,U need to connect it to a source qualifer transformation.The source qualifier transformation represnets the records that the informatica server reads when it runs a session.

What is incremantal aggregation?
When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. If the source changes only incrementally and you
can capture changes, you can configure the session to process only those changes. This allows the Informatica Server to update your target incrementally, rather than
forcing it to process the entire source and recalculate the same calculations each time you run the session.

If you are workflow is running slow in informatica. Where do you start trouble shooting and
what are the steps you follow?
When the work flow is running slowly u have to find out the bottlenecks
in this order target source mapping session system

What is exact use of 'Online' and 'Offline' server connect Options while defining Work flow in Work flow monitor? . The system hangs when 'Online' server connect option. The Informatica is installed on a Personal laptop.
When the repo is up and the PMSERVER is also up, workflow monitor always will be connected on-line.
When PMserver is down and the repo is still up we will be prompted for an off-line connection with which we can just monitor the workflows

Explain about perform recovery?
When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY table and notes the row ID of the last row committed to the target database.
The Informatica Server then reads all sources again and starts processing from the next row ID. For example, if the Informatica Server commits 10,000 rows before the
session fails, when you run recovery, the Informatica Server bypasses the rows up to 10,000 and starts loading with row 10,001.
By default, Perform Recovery is disabled in the Informatica Server setup. You must enable Recovery in the Informatica Server setup before you run a session so the
Informatica Server can create and/or write entries in the OPB_SRVR_RECOVERY table.

How the informatica server sorts the string values in Ranktransformation?
When the informatica server runs in the ASCII data movement mode it sorts session data using Binary sortorder.If you configure the seeion to use a binary sort order,the
informatica server caluculates the binary value of each string and returns the specified number of rows with the higest binary values for the string.

Questions IV

What is tracing level?
Ya its the level of information storage in session log.
The option comes in the properties tab of transformations. By default it remains "Normal". Can be Verbose Initialisation Verbose Data Normal or Terse.

Can we run a group of sessions without using workflow manager
ya Its Posible using pmcmd Command with out using the workflow Manager run the group of session. as per my knowledge i give the answer.

How do you create single lookup transformation using multiple tables?
Write a override sql query. Adjust the ports as per the sql query.

How can we join the tables if the tables have no primary and forien key relation and no matchig port to join?
without common column or common data type we can join two sources using dummy ports.
1.Add one dummy port in two sources.
2.In the expression trans assing '1' to each port.
2.Use Joiner transformation to join the sources using dummy port(use join conditions).
hope this will help.

Describe two levels in which update strategy transformation sets?
Within a session: When you configure a session, yoYou can instruct the Informatica Server to either treat all records in the same way (for example, treat all records as inserts), or use instructions coded into the session mapping to flag records for different database operations.
Within a mapping: Within a mapping, you use the Update Strategy transformation to flag records for insert, delete, update, or reject.

How do you handle decimal places while importing a flatfile into informatica?
While importing flat file definetion just specify the scale for a neumaric data type. in the mapping, the flat file source supports only number datatype(no decimal and integer). In the SQ associated with that source will have a data type as decimal for that number port of the source.
source ->number datatype port ->SQ -> decimal datatype.Integer is not supported. hence decimal is taken care.

How do you handle decimal places while importing a flatfile into informatica?
while importing flat file definetion just specify the scale for a neumaric data type. in the mapping, the flat file source supports only number datatype(no decimal and integer). In the SQ associated with that source will have a data type as decimal for that number port of the source.
source ->number datatype port ->SQ -> decimal datatype.Integer is not supported. hence decimal is taken care.

What is parameter file?
When you start a workflow, you can optionally enter the directory and name of a parameter file. The Informatica Server runs the workflow using the parameters in the file you specify.
For UNIX shell users, enclose the parameter file name in single quotes:
-paramfile '$PMRootDir/myfile.txt'
For Windows command prompt users, the parameter file name cannot have beginning or trailing spaces. If the name includes spaces, enclose the file name in double quotes:
-paramfile ”$PMRootDir\my file.txt”
Note: When you write a pmcmd command that includes a parameter file located on another machine, use the backslash (\) with the dollar sign ($). This ensures that the machine where the variable is defined expands the server variable.
pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '\$PMRootDir/myfile.txt'

What is aggregate cache in aggregator transforamtion?
When you run a workflow that uses an Aggregator transformation, the Informatica Server creates index and data caches in memory to process the transformation. If the Informatica Server requires more space, it stores overflow values in cache files.

Questions III

In my source table 1000 rec's r there.I want to load 501 rec to 1000 rec into my Target table ?how can u do this ?
You can overide the sql Query in Wofkflow Manager. LIke
select * from tab_name where rownum<=1000 minus select * from tab_name where rownum<=500; This will work fine. Try it and get back to me if u have any issues about the same.

How to Generate the Metadata Reports in Informatica? You can generate PowerCenter Metadata Reporter from a browser on any workstation, even a workstation that does not have PowerCenter tools installed.

What r the methods for creating reusable transforamtions? You can design using 2 methods using transformation developer create normal one and promote it to reusable

What are the diffrence between joiner transformation and source qualifier transformation? You can join hetrogenious data sources in joiner transformation which we can not achieve in source qualifier transformation. You need matching keys to join two relational sources in source qualifier transformation.Where as you doesn’t need matching keys to join two sources. Two relational sources should come from same datasource in sourcequalifier.You can join relatinal sources which are coming from diffrent sources also. Can any body write a session parameter file which will change the source and targets for every session. i.e different source and targets for each session run. You are supposed to define a parameter file. And then in the Parameter file, you can define two parameters, one for source and one for target. Give like this for example: $Src_file = c:\program files\informatica\server\bin\abc_source.txt $tgt_file = c:\targets\abc_targets.txt Then go and define the parameter file: [folder_name.WF:workflow_name.ST:s_session_name] $Src_file =c:\program files\informatica\server\bin\abc_source.txt $tgt_file = c:\targets\abc_targets.txt If its a relational db, you can even give an overridden sql at the session level...as a parameter. Make sure the sql is in a single line. In a sequential batch can you run the session if previous session fails? Yes.By setting the option always runs the session.

Can you use the maping parameters or variables created in one maping into any other reusable transformation? Yes.Because reusable tranformation is not contained with any maplet or maping.

Can you generate reports in Informatcia? Yes. By using Metadata reporter we can generate reports in informatica. There are 3 depts in dept table and one with 100 people and 2nd with 5 and 3rd with some 30 and so. i want to diplay those deptno where more than 10 people exists Yes! the answer provided is absolutely right. by an SQL application(Oracle). If you want to perform it thru informatica, the Fire the same query in the SQL Override of Source qualifier transformation and make a simple pass thru mapping. Other wise, you can also do it by using a Filter.Router transformation by giving the condition there deptno>=10.

Can you copy the session to a different folder or repository?
Yes. By using copy session wizard You can copy a session in a different folder or repository. But that target folder or repository should consists of mapping of that session.
If target folder or repository is not having the maping of copying session ,
You should have to copy that maping first before you copy the session.

Can Informatica be used as a Cleansing Tool? If Yes, give example of transformations that can implement a data cleansing routine.
Yes, we can use Informatica for cleansing data. some time we use stages to cleansing the data. It depends upon performance again else we can use expression to cleasing data.
For example an feild X have some values and other with Null values and assigned to target feild where target feild is notnull column, inside an expression we can assign space or some constant value to avoid session failure.

The input data is in one format and target is in another format, we can change the format in expression.
we can assign some default values to the target to represent complete set of data in the target.

Can batches be copied/stopped from server manager?
Yes, we can stop the batches using server manager or pmcmd command

Can Informatica load heterogeneous targets from heterogeneous sources?
Yes it can. For example...Flat File and Relations sources are joined in the mapping, and later, Flat File and relational targets are loaded.

Questions II

What is difference between IIF and DECODE function?
You can use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if sales is zero or negative:
IIF( SALES > 0, IIF( SALES <> 0 and SALES <> 49 AND SALES <> 99 AND SALES <> 199, BONUS)

What are the new features of the server manager in the informatica 5.0?
You can use command line arguments for a session or batch.This allows you to change the values of session parameters,and mapping parameters and maping variables.
Parallel data processing: This feature is available for powercenter only.If we use the informatica server on a SMP system, You can use multiple CPU’s to process a session
concurently.
Process session data using threads: Informatica server runs the session in two processes.

Can we use aggregator/active transformation after update strategy transformation?
You can use aggregator after update strategy. The problem will be, once you perform the update strategy, say you had flagged some rows to be deleted and you had performed aggregator transformation for all rows, say you are using SUM function, then the deleted rows will be subtracted from this aggregator transformation.

what is the best way to show metadata(number of rows at source, target and each transformation level, error related data) in a report format?
You can select these details from the repository table. you can use the view REP_SESS_LOG to get these data

Can you start a batches with in a batch?
You can not. If you want to start batch that resides in a batch,create a new independent batch and copy the necessary sessions into the new batch.

How can you create or import flat file definition in to the warehouse designer?
You can not create or import flat file defintion in to warehouse designer directly.Instead you must analyze the file in source analyzer,then drag it into the warehouse
designer.

When you drag the flat file source defintion into warehouse desginer workspace,the warehouse designer creates a relational target defintion not a file defintion.If you want to load to a file,configure the session to write to a flat file.When the informatica server runs the session,it creates and loads the flat file.

Informatica Questions I

What is target load order?
You specify the target loadorder based on source qualifiers in a maping.If you have the multiple source qualifiers connected to the multiple targets,You can designatethe order in which informatica server loads data into the targets.

How do you configure mapping in informatica?

You should configure the mapping with the least number of transformations and expressions to do the most amount of work possible. You should minimize the amount of data moved by deleting unnecessary links between transformations. For transformations that use data cache (such as Aggregator, Joiner, Rank, and Lookup transformations), limit connected input/output or output ports. Limiting the number of connected input/output or output ports reduces the amount of data the transformations store in the data cache.
You can also perform the following tasks to optimize the mapping:

Configure single-pass reading.
Optimize datatype conversions.
Eliminate transformation errors.
Optimize transformations.
Optimize expressions.

You should configure the mapping with the least number of transformations and expressions to do the most amount of work possible. You should minimize the amount of data moved by deleting unnecessary links between transformations. For transformations that use data cache (such as Aggregator, Joiner, Rank, and Lookup transformations), limit connected input/output or output ports. Limiting the number of connected input/output or output ports reduces the amount of data the transformations store in the data cache.
You can also perform the following tasks to optimize the mapping:

Configure single-pass reading.
Optimize datatype conversions.
Eliminate transformation errors.
Optimize transformations.
Optimize expressions.

When conventional datawarehousing is able to handle and answer complex queries than wat is the need of OLTP and OLAP?

Designed for analysis of business measures by categories and attributes
Optimized for bulk loads and large, complex, unpredictable queries that access many rows per table. Loaded with consistent, valid data; requires no real time validation
Supports few concurrent users relative to OLTP Again, all these are covered in the tutorials, why and how you should select.

How can we join 3 database like Flat File, Oracle, Db2 in Informatrica.?
You have to use two joiner transformations.fIRST one will join two tables and the next one will join the third with the resultant of the first joiner.

How do you transfert the data from data warehouse to flatfile?
You can write a mapping with the flat file as a target using a DUMMY_CONNECTION. A flat file target is built by pulling a source into target space using Warehouse Designer tool.

How can u work with remote database in informatica?did u work directly by using remote connections?
You can work with remote, But you have to Configure FTP Connection details IP address
User authentication

Cluster analysis Question

1. What is cluster analysis? What are the typical applications of cluster analysis? 2 to 8
2. What are the requirements of clustering in data mining? 9 to 13
3. What are the types of data that occur in cluster analysis and the way to
process them for analysis purposes?
4. Explain about interval‐scaled variables in cluster analysis. 18 to 23
5. Explain about binary variables in cluster analysis. 24 to 30
6. Explain about nominal, ordinal, and ratio‐scaled variables in cluster analysis. 31 to 38
7. Explain about mixed‐type variables in cluster analysis. 39 to 41
8. Briefly introduce the major clustering methods 42 to 51
9. Explain the centroid‐based k‐means method 52 to 59
10. Explain the variations of centroid‐based k‐means method 60 to 68
11. Explain about CLARA, a sampling‐based method 69 to 73
12. Write a short note on hierarchical clustering 74
13. Explain agglomerative and divisive hierarchical clustering 75 to 81
14. Explain the BIRCH method of hierarchical clustering 82 to 88
15. Explain about CURE – clustering 89 to 93
16. Explain about ROCK , a hierarchical clustering algorithm 94
17. Explain the dynamic modeling of clustering technique – chameleon 95 to 100
18. Explain about DBSCAN, a density based clustering method 101 to 109
19. Explain about the cluster analysis method – OPTICS 110 to 114
20. Explain about the density based clustering method – DENCLUE 115 to 120
21. Explain about STING, the grid‐based multiresolution clustering technique 121 to 127
22. Write about WaveCluster, a multiresolution clustering algorithm 128 to 130
23. Explain the CLIQUE (CLusering In QUEst) clustering algorithm that integrates
density‐based and grid‐based clustering
24. Explain different model‐bases statistical clustering methods 136 to 148
25. Explain different neural networks‐based clustering methods 149 to 152
26. Write a short note on outlier analysis including the causes of outliers and the
applications of outlier analysis
27. Explain how outliers are detected using the discordancy‐test and its variations 158
28. Write the outlines of algorithms for mining distance‐based outliers and
explain them
29. What is meant by deviation‐based outlier detection and explain the two
techniques of deviation‐based outlier techniques

Classification and prediction Question

1. What are classification and prediction? Explain using suitable examples 2 to 9
2. Explain about the preprocessing steps for preparing data for classification and
prediction
3. List out the criteria for comparing classification and prediction methods 14
4. What is a decision tree? How and why decision trees are used in classification? 15 to
5. Explain the decision tree induction algorithm and illustrate it. 19 to 32
6. Explain about tree pruning 33 to 41
7. Explain the different methods used to make decision tree induction more
scalable
8. What are Bayesian classifiers? 51 and 52
9. Explain the Bayes’ theorem 53 to 56
10. Explain the working of naïve or simple Bayesian classifier with an illustration 57 to 64
11. Explain about Bayesian belief networks 65 to 69
12. Explain the possible scenarios in training or learning in Bayesian belief
networks
13. Write a brief note on neural networks and classification by backpropagation 75 and
14. Explain about backpropagation based learning a on feed forward multilayer
neural network
15. Write a note on designing the topology of a neural network 80 and 81
16. Write and explain the backpropagation algorithm 82 to 92
17. Illustrate the backpropagation algorithm 93 to 95
18. Analyze the backpropagation algorithm in terms of interpretability 96 to 99
19. Explain different methods for applying the concepts of association rule mining
for classification
20. Write about k‐nearest neighborhood classification 110 to 113
21. Write about Case‐based reasoning in classification 114 to 116
22. Write about Genetic algorithms for classification 117 to 119
23. Write about Rough set in classification 120 to 123
24. Write about Fuzzy set approach to classification 124 to 127
25. Explain about prediction of continuous values modeled by linear and multiple
regression
26. Explain about prediction of continuous values modeled by nonlinear regression 27. Write a short on logistic regression and Poisson regression 136 to 138
28. Explain the holdout and cross‐validation techniques for assessing classifier
accuracy
29. Explain the different methods for improving classifier accuracy 144 to 148
30. Explain about sensitivity and specificity measures to judge a classifier 149 to 153