Monday, October 4, 2010

Extraction, Transformation and Loading

ETL
SAP BW offers flexible means for integrating data from various sources. Depending on the data warehousing strategy for your application scenario, you can extract the data from the source and load it into the SAP BW system or directly access the data in the source without storing it physically in the Enterprise Data Warehouse. In this case the data is integrated virtually in the Enterprise Data Warehouse. Sources for the Enterprise Data Warehouse can be operational, relational datasets (for example in SAP systems), files or older systems. Multidimensional sources, such as data from other BI systems, are also possible. Transformations permit you to perform a technical cleanup and to consolidate the data from a business point of view.
Extraction and Loading
Extraction and transfer processes in the initial layer of SAP BW as well as direct access to data are possible using various interfaces, depending on the origin and format of the data. In this way SAP BW allows the integration of relational and multidimensional data as well as of SAP and non-SAP data.
BI Service API (BI Service Application Programming Interface)
The BI Service API permits extraction of and direct access to data from SAP systems in standardized form. These can be SAP application systems or SAP BW systems. The data request is controlled from the SAP BW system.
File Interface
The file interface permits the extraction from and direct access to files, such as csv files. The data request is controlled from the SAP BW system.
Web Services
Web services permit you to send data to the SAP BW system under external control.
UD Connect (Universal Data Connect)
UD Connect permits the extraction from and direct access to both relational and multidimensional data. The data request is controlled from the SAP BW system.
DB Connect (Database Connect)
DB Connect permits the extraction from and direct access to data lying in tables or views of a database management system. The data request is controlled from the SAP BW system.
Staging BAPIs (Staging Business Application Programming Interfaces)
Staging BAPIs are open interfaces that third-party tools can use to extract data from older systems. The data transfer can be triggered by a request from the SAP BW system or by a third-party tool.
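The interfaces above differ mainly in who initiates the transfer: most are requested from the SAP BW system, while Web services deliver data under external control. The following is a minimal Python sketch of that distinction only. The class `StagingArea` and its methods are hypothetical names invented for illustration and are not part of any SAP interface.

```python
import csv
import io


class StagingArea:
    """Hypothetical stand-in for the entry layer of a warehouse; not an SAP API."""

    def __init__(self):
        self.records = []

    def pull_from_file(self, file_obj):
        """Warehouse-controlled request: the warehouse reads the source itself."""
        rows = list(csv.DictReader(file_obj))
        self.records.extend(rows)
        return len(rows)

    def receive_push(self, rows):
        """Externally controlled transfer: a caller sends rows to the warehouse."""
        rows = list(rows)
        self.records.extend(rows)
        return len(rows)


staging = StagingArea()

# Pull: the warehouse side decides when to read the (in-memory) CSV source.
source_file = io.StringIO("product,amount\nA,10\nB,20\n")
pulled = staging.pull_from_file(source_file)

# Push: an external system decides when to deliver data.
pushed = staging.receive_push([{"product": "C", "amount": "5"}])
```

In both cases the rows end up in the same staging list; only the side that triggers the transfer changes.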
Transformation
With transformations, data loaded into the SAP BW system through the specified interfaces is converted from a source format to a target format in the data warehouse layers. The transformation permits you to consolidate, clean up and integrate the data, synchronizing it technically and semantically so that it can be evaluated. This is done using rules that permit any degree of complexity when transforming the data. The functionality includes 1:1 assignment of the data, the use of complex functions in formulas, and the custom programming of transformation rules. For example, you can define formulas that use the functions of the transformation library. The functions offered for defining formulas include basic functions (such as and, if, less than, greater than), character string functions (such as displaying values in uppercase), date functions (such as computing the quarter from the date) and mathematical functions (such as division, exponential functions).
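To make the kinds of rules mentioned above concrete, here is a small sketch in Python rather than in the SAP BW formula editor. The field names (`order_id`, `country`, `posting_date`, `revenue`, `quantity`) and the function names are assumptions made up for this example.

```python
from datetime import date


def quarter_of(d):
    """Date function: compute the calendar quarter (1-4) from a date."""
    return (d.month - 1) // 3 + 1


def transform(source_row):
    """Map one source record to a hypothetical target format."""
    quantity = source_row["quantity"]
    return {
        # 1:1 assignment: the value is copied unchanged.
        "order_id": source_row["order_id"],
        # Character string function: trim and display the value in uppercase.
        "country": source_row["country"].strip().upper(),
        # Date function: derive the quarter from the posting date.
        "quarter": quarter_of(source_row["posting_date"]),
        # Basic (if, greater than) and mathematical (division) functions.
        "unit_price": source_row["revenue"] / quantity if quantity > 0 else None,
    }


row = {
    "order_id": 4711,
    "country": " de",
    "posting_date": date(2010, 10, 4),
    "revenue": 250.0,
    "quantity": 10,
}
target = transform(row)
```

Each key in the returned dictionary corresponds to one rule type: direct assignment, a string function, a date function, and a formula that combines a condition with division.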
Availability Requirements for Data in SAP BW
For different business problems, the data might need to be more or less up-to-date.
For example, if you want to check the sales strategy for a product group each month, you need the sales data for this time span. Historic, aggregated data is taken into consideration. The scheduler is an SAP BW tool that loads the data at regular intervals, for example every night, using a job that is scheduled in the background. In this way no additional load is put on the operational system. We recommend that you use standard data acquisition, that is, scheduled regular data transfers, to support your strategic decision-making procedure.
Tactical decision-making usually calls for data that is quite up-to-date and granular, for example when you analyze error rates in production in order to configure the production machines optimally. The data can be staged in the SAP BW system based on its availability and loaded at intervals of minutes. A permanently active job of SAP background processing is used here; this job is controlled by a special process, a daemon. This procedure of data staging is called real-time data acquisition.
Loading the data into a data warehouse means that data analysis does not affect the performance of the source system. The load processes, however, require administrative overhead. If you need data that is very up-to-date, and users only access a small dataset sporadically or only a few users run queries on the dataset at the same time, you can read the data directly from the source during analysis and reporting. In this case the data is not stored in the SAP BW system; data staging is virtual. You use the VirtualProvider here. This procedure is called direct access.
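The three procedures described in this section can be summarized as a rough decision rule. The sketch below is only an illustration of that reasoning in Python. The `Freshness` categories, the function name, and the fallback for live data on large or heavily queried datasets are assumptions, not SAP guidance.

```python
from enum import Enum


class Freshness(Enum):
    PERIODIC = 1  # regular loads, for example every night, are sufficient
    MINUTES = 2   # data should be at most a few minutes old
    LIVE = 3      # data must reflect the source at query time


def choose_staging(freshness, small_or_rarely_queried):
    """Return the name of the staging procedure that fits the requirement."""
    if freshness is Freshness.LIVE and small_or_rarely_queried:
        # Read from the source at query time; nothing is stored in the warehouse.
        return "direct access"
    if freshness in (Freshness.MINUTES, Freshness.LIVE):
        # Assumption: live needs on large or busy datasets fall back to
        # frequent loading so that queries do not burden the source system.
        return "real-time data acquisition"
    return "standard data acquisition"


monthly_sales_review = choose_staging(Freshness.PERIODIC, False)
production_error_rates = choose_staging(Freshness.MINUTES, False)
occasional_lookup = choose_staging(Freshness.LIVE, True)
```

The three example calls mirror the monthly sales check, the production error analysis, and the sporadic query on a small dataset described above.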

Source : http://help.sap.com

Sunday, October 3, 2010

SAP BW Data Warehouse Architecture

The different layers contain data in differing levels of granularity. We differentiate between the following layers:

● Persistent staging area
● Data warehouse
● Architected data marts
● Operational data store



Persistent Staging Area
After it is extracted from source systems, data is transferred to the entry layer of the data warehouse, the persistent staging area (PSA). In this layer, data is stored in the same form as in the source system. The way in which data is transferred from here to the next layer incorporates quality-assuring measures and the transformations and clean up required for a uniform, integrated view of the data.
Data Warehouse
The result of the first transformations and clean up is saved in the next layer, the data warehouse. This data warehouse layer offers integrated, granular, historic, stable data that has not yet been modified for a concrete usage and can therefore be seen as neutral. It acts as the basis for building consistent reporting structures and allows you to react to new requirements with flexibility.
Architected Data Marts
The data warehouse layer supplies the analysis structures, most of which are multidimensional. These are also called architected data marts. This layer satisfies data analysis requirements. Data marts are not necessarily to be equated with summarized or aggregated data; here too you find highly granular structures, but they are focused on data analysis requirements alone, unlike the granular data in the data warehouse layer, which is application-neutral so as to ensure reusability.
The term “architected” refers to data marts that are not isolated applications but are based on a universally consistent data model. This means that master data can be reused in the form of Shared or Conformed Dimensions.
Operational Data Store
As well as strategic data analysis, a data warehouse also supports operational data analysis by means of the operational data store. Data can be updated to an operational data store continually or at short intervals and be read for operational analysis. You can also forward the data from the operational data store layer to the data warehouse layer at set times. This means that the data is stored in different levels of granularity: while the operational data store layer contains all the changes to the data, the data warehouse layer stores only the end-of-day status, for example.
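The difference in granularity can be illustrated with a short Python sketch: a change log stands in for the operational data store, and only the last status per record and day is kept for the data warehouse layer. The record keys, statuses and the function name are hypothetical sample data, not taken from an SAP system.

```python
from datetime import datetime

# Operational data store: every change is kept (key, change time, status).
changes = [
    ("order-1", datetime(2010, 10, 3, 9, 0), "open"),
    ("order-1", datetime(2010, 10, 3, 17, 30), "shipped"),
    ("order-2", datetime(2010, 10, 3, 11, 15), "open"),
    ("order-1", datetime(2010, 10, 4, 8, 45), "delivered"),
]


def day_end_status(change_log):
    """Keep only the last status of each record for each calendar day."""
    latest = {}
    # Process changes in time order so later changes overwrite earlier ones.
    for key, changed_at, status in sorted(change_log, key=lambda c: c[1]):
        latest[(key, changed_at.date())] = status
    return latest


# Data warehouse layer: one entry per record and day.
snapshot = day_end_status(changes)
```

Here the four changes collapse into three end-of-day entries, because the two changes to `order-1` on October 3 are reduced to the final status of that day.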
The layer architecture of the data warehouse is largely conceptual. In reality the boundaries between these layers are often fluid; an individual data store can play a role in two different layers. The technical implementation is always specific to the organization.
Source : http://help.sap.com