Data WareHousing Interview Questions & Answers: Informatica Interview Questions And Answers IV

What is the difference between stop and abort?
Stop: If the session you want to stop is part of a batch, you must stop the batch. If the batch is part of a nested batch, stop the outermost batch.
Abort: You can issue the abort command. It is similar to the stop command except that it has a 60-second timeout. If the server cannot finish processing and committing data within 60 seconds, it kills the DTM process and terminates the session.
What is the status code?
The status code provides error handling for the Informatica server during the session. The stored procedure issues a status code that indicates whether or not the stored procedure completed successfully. The user cannot see this value. The Informatica server uses it only to determine whether to continue running the session or stop.
Difference between static cache and dynamic cache?
Static cache:
You cannot insert into or update the cache.
The Informatica server returns a value from the lookup table or cache when the condition is true. When the condition is not true, the Informatica server returns the default value for connected transformations and null for unconnected transformations.
Dynamic cache:
You can insert rows into the cache as you pass them to the target.
The Informatica server inserts rows into the cache when the condition is false. This indicates that the row is not in the cache or the target table. You can pass these rows to the target table.
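The difference between the two cache types can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not Informatica code: the class names, the method names, and the sample rows are all hypothetical.

```python
class StaticLookupCache:
    """Read-only cache: built once and never modified during the run."""

    def __init__(self, rows, default=None):
        self._rows = dict(rows)
        self._default = default

    def lookup(self, key):
        # Condition true: return the cached value.
        # Condition false: return the default (None stands in for null).
        return self._rows.get(key, self._default)


class DynamicLookupCache:
    """Cache that accepts new rows while the data is being processed."""

    def __init__(self, rows):
        self._rows = dict(rows)

    def lookup_or_insert(self, key, value):
        # Returns (value, inserted). inserted=True means the row was not
        # in the cache, so it was added and can be passed to the target.
        if key in self._rows:
            return self._rows[key], False
        self._rows[key] = value
        return value, True


# Hypothetical sample data: customer id -> name.
static = StaticLookupCache({101: "Alice"}, default="UNKNOWN")
dynamic = DynamicLookupCache({101: "Alice"})

incoming = [(101, "Alice"), (102, "Bob"), (102, "Bob")]
new_keys = [k for k, v in incoming if dynamic.lookup_or_insert(k, v)[1]]
print(static.lookup(999), new_keys)
```

Note that the second row for key 102 is not inserted again, because the first one is already in the dynamic cache by the time it arrives.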
What is the PowerCenter repository?
A PowerCenter repository can be one of the following types:
Standalone repository. A repository that functions individually, unrelated and unconnected to other repositories.
Global repository. (PowerCenter only.) The centralized repository in a domain, a group of connected repositories. Each domain can contain one global repository. The global repository can contain common objects to be shared throughout the domain through global shortcuts.
Local repository. (PowerCenter only.) A repository within a domain that is not the global repository. Each local repository in the domain can connect to the global repository and use objects in its shared folders.
What are the joiner caches?
The Joiner transformation caches the master records and an index to those records. The cache directory property specifies the directory used for these cache files. By default, the cached files are created in a directory specified by the server variable $PMCacheDir. If you override the directory, make sure the directory exists and contains enough disk space for the cache files. The directory can be a mapped or mounted drive.
If the source has duplicate records and we have two targets, T1 for unique values and T2 only for duplicate values, how do we pass the unique values to T1 and the duplicate values to T2 in a single mapping?
source ---> sq ---> exp ---> sorter (with the Select Distinct check box enabled) ---> t1
exp ---> aggregator (with group by enabled and a count function) ---> t2
If you want only duplicates in t2, you can follow this sequence:
exp ---> agg (with group by enabled and the expression decode(count(col),1,1,0)) ---> Filter (condition: the decode result is 0) ---> t2
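The two branches of this mapping can be mimicked in Python to check the logic. This is a rough sketch under simple assumptions: each row is a single hashable, sortable value, and the function name and sample data are made up for illustration.

```python
from collections import Counter


def split_unique_and_duplicates(rows):
    """Return (t1, t2): distinct values, and values that occur more than once."""
    counts = Counter(rows)
    # Branch 1 (sorter with select distinct): one copy of every value, sorted.
    t1 = sorted(counts)
    # Branch 2 (aggregator + filter): decode(count, 1, 1, 0) yields 0 for any
    # count other than 1, so the filter keeps only values seen more than once.
    t2 = sorted(value for value, n in counts.items() if n != 1)
    return t1, t2


t1, t2 = split_unique_and_duplicates(["a", "b", "a", "c", "b", "a"])
print(t1, t2)
```

Here T1 receives every distinct value once, while T2 receives one row per value that was duplicated in the source.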
What is the difference between the joiner transformation and the source qualifier transformation?
Source qualifier: joins homogeneous sources (tables from the same database).
Joiner: joins heterogeneous sources.
While importing a relational source definition from a database, what source metadata do you import?
Source name
Database location
Column names
Datatypes
Key constraints.
What are the unsupported repository objects for a mapplet?
According to the Informatica documentation of this era, you cannot use the following objects in a mapplet:
COBOL source definitions
Joiner transformations
Normalizer transformations
Non-reusable Sequence Generator transformations
Pre- or post-session stored procedures
Target definitions
PowerMart 3.5-style LOOKUP functions
XML source definitions
IBM MQ source definitions
What are the types of metadata stored in the repository?
Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.
Target definitions. Definitions of database objects or files that contain the target data.
Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.
Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
Reusable transformations. Transformations that you can use in multiple mappings.
Mapplets. A set of transformations that you can use in multiple mappings.
Sessions and workflows. Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.
Suppose session is configured with commit interval of 10,000 rows and source has 50,000 rows. Explain the commit points for Source based commit and Target based commit. Assume appropriate value wherever required.
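Source-based commit: the server commits data to the target based on the number of source rows and the commit interval, so it commits after every 10,000 source rows (at 10,000, 20,000, 30,000, 40,000, and 50,000 rows).
Target-based commit: the server commits based on the rows that reach the target and on the writer buffer. After the commit interval is reached, the server keeps filling the writer buffer and issues the commit when the buffer block is full, so the number of rows committed is usually at or above the commit interval. Let us assume the buffer fills every 6,000 rows. The first commit then occurs at 12,000 rows, the first buffer fill at or after 10,000, rather than at exactly 10,000.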
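The source-based case is simple enough to compute directly. The helper below is a hypothetical illustration, not an Informatica function, and it assumes that a final commit is issued at the end of the data when the row count is not an exact multiple of the interval.

```python
def source_based_commit_points(total_rows, commit_interval):
    """Row counts at which a source-based commit would be issued."""
    points = list(range(commit_interval, total_rows + 1, commit_interval))
    # Assumption: leftover rows are committed when the source is exhausted.
    if total_rows > 0 and (not points or points[-1] != total_rows):
        points.append(total_rows)
    return points


print(source_based_commit_points(50_000, 10_000))
```

For the question's numbers this gives five commit points, one per 10,000 source rows.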
What are reusable transformations?
Reusable transformations can be used in multiple mappings. When you need to incorporate such a transformation into a mapping, you add an instance of it to the mapping. If you later change the definition of the transformation, all instances of it inherit the changes. Because each instance of a reusable transformation is a pointer to that transformation, you can change the transformation in the Transformation Developer and its instances automatically reflect the changes. This feature can save you a great deal of work.
What are the types of mapping in the Getting Started Wizard?
Simple Pass Through mapping:
Loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.
Slowly Growing Target mapping:
Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when existing data does not require updates.
What are the types of mapping wizards provided in Informatica?
Getting Started Wizard:
Simple Pass Through
Slowly Growing Target
Slowly Changing Dimensions Wizard:
Type 1: Most recent values
Type 2: Full history (tracked by Version, Flag, or Date)
Type 3: Current and one previous value
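What each slowly changing dimension type keeps can be shown with a small Python sketch. It illustrates the idea only, not the mappings the wizard generates. The function names, the dictionary layout, and the choice to number versions from 0 are assumptions made for the example.

```python
def apply_type1(dim, key, value):
    """Type 1: overwrite, so only the most recent value survives."""
    dim[key] = {"value": value}


def apply_type2(dim, key, value):
    """Type 2: keep full history by adding a new versioned row per change."""
    versions = dim.setdefault(key, [])
    if versions and versions[-1]["value"] == value:
        return  # unchanged, nothing to record
    versions.append({"value": value, "version": len(versions)})


def apply_type3(dim, key, value):
    """Type 3: keep the current value and one previous value."""
    row = dim.get(key)
    if row is None:
        dim[key] = {"current": value, "previous": None}
    elif row["current"] != value:
        dim[key] = {"current": value, "previous": row["current"]}


# Hypothetical example: one customer's city changes twice.
t1, t2, t3 = {}, {}, {}
for city in ["Pune", "Delhi", "Mumbai"]:
    apply_type1(t1, "cust1", city)
    apply_type2(t2, "cust1", city)
    apply_type3(t3, "cust1", city)

print(t1["cust1"], len(t2["cust1"]), t3["cust1"])
```

After the three loads, Type 1 holds only the latest city, Type 2 holds all three as separate versions, and Type 3 holds the latest city plus the one before it.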
What are Dimensions and various types of Dimensions?
A dimension is a set of level properties that describe a specific aspect of a business. It is used for analyzing the factual measures of one or more cubes that use that dimension. Examples: geography, time, customer, and product.
What are the session parameters?
Session parameters, like mapping parameters, represent values you might want to change between sessions, such as database connections or source files.
The Server Manager also allows you to create user-defined session parameters. The following are user-defined session parameters:
Database connections
Source file name: Use this parameter when you want to change the name or location of the session source file between session runs.
Target file name: Use this parameter when you want to change the name or location of the session target file between session runs.
Reject file name: Use this parameter when you want to change the name or location of the session reject files between session runs.
What are sessions and batches?
Session: A session is a set of instructions that tells the Informatica server how and when to move data from sources to targets. After creating the session, we can use either the Server Manager or the command line program pmcmd to start or stop the session.
Batches: A batch provides a way to group sessions for either serial or parallel execution by the Informatica server.
There are two types of batches:
Sequential: runs sessions one after the other.
Concurrent: runs sessions at the same time.
If a session fails after loading 10,000 records into the target, how can you load the records starting from the 10,001st record when you run the session next time in Informatica 6.1?
Running the session in recovery mode will work, but the target load type should be normal. If it is bulk, recovery won't work as expected.
What is the difference between the Informatica PowerCenter server, the repository server, and the repository?
The repository is a database in which all Informatica components are stored in the form of tables. The repository server controls the repository and maintains data integrity and consistency across the repository when multiple users use Informatica. The PowerCenter server (Informatica server) is responsible for executing the components (sessions) stored in the repository.
How can you access a remote source in your session?
Relational source: To access a relational source located in a remote place, you need to configure a database connection to the data source.
File source: To access a remote source file, you must configure the FTP connection to the host machine before you create the session.
Heterogeneous: When your mapping contains more than one source type, the Server Manager creates a heterogeneous session that displays source options for all types.
Difference between Rank and Dense Rank?
Rank:
1
2 <-- 2nd position
2 <-- 3rd position
4
5
The same rank is assigned to equal totals/numbers. The next rank follows the position, so a rank is skipped after a tie. Golf usually ranks this way.
Dense Rank:
1
2 <-- 2nd position
2 <-- 3rd position
3
4
The same rank is assigned to equal totals/numbers/names. The next rank follows the serial number, so no rank is skipped.
What is the rank transformation? Where can we use this transformation?
The rank transformation is used to rank rows and select the top or bottom ones. For example, if we have a sales table in which many employees sell the same product and we need to find the top 5 or 10 employees who sell the most, we can use the rank transformation.
In update strategy, which gives better performance, a target table or a flat file? Why?
Flat file pros: Loading, sorting, and merging operations are faster because there are no indexes and the data is in ASCII mode.
Flat file cons: Existing records in a flat file cannot be updated. Because there are no indexes, lookups are slower.
What is the command used to run a batch?
pmcmd is used to start a batch.
What are mapping parameters and mapping variables?
Please refer to the documentation for a fuller explanation.
Mapping variables have two identities: start value and current value.
Start value = current value (when the session starts executing the underlying mapping).
Start value <> current value (while the session is in progress and the variable value has changed on one or more occasions).
The current value at the end of the session becomes the start value for the subsequent run of the same session.
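The start value and current value life cycle can be modeled with a small Python class. This is a hypothetical sketch: the class, its methods, and the `saved` attribute (standing in for the value the repository keeps between runs) are illustrative names, not Informatica APIs.

```python
class MappingVariable:
    """Toy model of a mapping variable's start and current values."""

    def __init__(self, initial):
        self.saved = initial    # value persisted between session runs
        self.start = initial
        self.current = initial

    def begin_session(self):
        # At session start, start value == current value.
        self.start = self.saved
        self.current = self.start

    def set_max(self, candidate):
        # During the run the current value may move away from the start value.
        self.current = max(self.current, candidate)

    def end_session(self):
        # The final current value becomes the next run's start value.
        self.saved = self.current


var = MappingVariable(0)
var.begin_session()
var.set_max(5)
var.set_max(3)
var.end_session()

var.begin_session()
print(var.start, var.current)
```

The second run starts at 5, the value the first run ended with, and start and current values are equal again until the variable changes.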
How do we estimate the depth of the session scheduling queue? Where do we set the number of maximum concurrent sessions that Informatica can run at a given time?
Please be more specific about the first half of the question.
You set the maximum number of concurrent sessions in the Informatica server configuration. The default is 10, and you can set it to any number.
Where should you place the flat file to import the flat file definition into the Designer?
Place it in a local folder.
Why do we use session partitioning in Informatica?
Partitioning improves session performance by reducing the time needed to read the source and load the data into the target.
The Informatica server achieves this by creating multiple partitions of the pipeline in a single session and performing the extract, transformation, and load for each partition in parallel.
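The idea of splitting one pipeline into partitions that run side by side can be sketched with Python's standard library. This is an analogy only, not how the Informatica server is implemented: the function names, the round-robin split, and the doubling "transformation" are all placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def run_partition(rows):
    """Extract, transform, and 'load' one partition; returns the loaded rows."""
    return [row * 2 for row in rows]  # placeholder transformation


def run_partitioned(rows, partitions):
    # Round-robin split of the source rows into `partitions` slices.
    slices = [rows[i::partitions] for i in range(partitions)]
    # Each slice is processed by its own worker.
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        results = list(pool.map(run_partition, slices))
    # Combine the per-partition output into one list of loaded rows.
    return [row for part in results for row in part]


loaded = run_partitioned(list(range(10)), partitions=3)
print(sorted(loaded))
```

The rows come back grouped by partition rather than in source order, which is why the example sorts them before printing. Threads are used here to keep the sketch short; for CPU-bound work in Python a process pool would be the closer match to true parallel execution.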

Friday, November 14, 2008
