Welcome to Data Innovations
Delivering successful data warehousing, master data management, data architecture and software integration projects since 2002. These projects have included the following services:
Data Innovations is an official service provider and software reseller for CA, Inc. We offer solutions to integrate CA software into our clients' environments.
Explore our website to learn more about these products, services and solutions.
Data Modeling
Data modeling is a key component of any IT development effort. Data modeling is the process of determining data requirements from the business requirements. The data requirements are identified and captured in the entity relationship diagram. Once the logical model is analyzed and normalized, a physical data model is derived and used to establish and maintain the new database.
The entire data modeling process is simplified through technology. The data modeling products from CA, Inc. (CA) provide a powerful development environment for creating and maintaining relational databases throughout your entire data architecture. Leverage the CA software to manage and control the evolution of your data and the relational databases that capture and store your data.
The CA ERwin® Data Modeler (ERwin) software automates the entire data modeling process. ERwin allows the data modeler to create logical models through a graphical interface. All of the metadata necessary to identify the entities and attributes is captured in the logical data model. The physical models are derived from the logical models in the RDBMS of choice. The software helps to facilitate communication between the data modeler and the database administrator.
ERwin also provides the ability to reverse engineer physical data models from existing relational databases. This powerful tool allows the data modeler to capture the physical metadata for existing databases for analysis and maintenance. ERwin also provides the complete compare tool. This powerful tool allows the modeler to quickly identify differences between two databases, a data model and a database, or two data models. When these tools are leveraged properly the entire data architecture is managed and governed through the products.
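The comparison idea described above can be illustrated with a small sketch. This is not ERwin's Complete Compare implementation; it assumes each model or database has already been reduced to a hypothetical `{table: {column: type}}` dictionary, and the function name `compare_models` is made up for illustration.

```python
def compare_models(left, right):
    """Compare two models given as {table: {column: type}} dictionaries.

    Tables are compared through their columns, so a table that exists on
    only one side shows up as a set of columns found only on that side.
    """
    diff = {"only_in_left": [], "only_in_right": [], "type_changed": []}
    for table in sorted(set(left) | set(right)):
        left_cols = left.get(table, {})
        right_cols = right.get(table, {})
        for column in sorted(set(left_cols) | set(right_cols)):
            name = f"{table}.{column}"
            if column not in right_cols:
                diff["only_in_left"].append(name)
            elif column not in left_cols:
                diff["only_in_right"].append(name)
            elif left_cols[column] != right_cols[column]:
                diff["type_changed"].append(
                    (name, left_cols[column], right_cols[column]))
    return diff


# Hypothetical data model versus the database it is supposed to describe.
model = {"policy": {"policy_id": "INTEGER", "insured_name": "VARCHAR(60)"}}
database = {"policy": {"policy_id": "INTEGER", "insured_name": "VARCHAR(40)",
                       "created_on": "DATE"}}
result = compare_models(model, database)
print(result)
```

The same function covers all three cases mentioned above (two databases, a model and a database, or two models), provided both sides are first loaded into the same dictionary shape.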
Enterprise Application (EA) and Enterprise Resource Planning (ERP) packages are a growing trend in the industry. CA provides the CA ERwin® Saphir Option (Saphir Option) to reverse engineer the data models used by the EA and ERP packages. The Saphir Option supports SAP, SAP BW, PeopleSoft Enterprise, PeopleSoft Enterprise One (formerly J.D. Edwards OneWorld), Siebel environments and the Oracle E-Business Suite. The ERwin data models created using the Saphir Option contain the English business names for the entities and attributes in the logical model and the proprietary, cryptic EA/ERP table and column names in the physical models. The Saphir Option unlocks the data model supporting the EA or ERP package for use in data modeling.
The CA ERwin® Model Manager (Model Manager) software provides a repository to centralize and secure the ERwin data models for collaboration and use across the organization. The data models are organized in the Model Manager libraries and then the versioning capability is used to track and control changes to relational databases across the entire architecture. The Model Manager reporting capability allows the modeler to locate data within the data architecture for use in master data management, data warehousing, and other data sourcing efforts. Model Manager provides the tools necessary to manage and govern all of the ERwin data models.
The CA ERwin® Data Profiler (Data Profiler) software is used to profile the data. The Data Profiler infers metadata from the data content. This information provides the knowledge about the data content needed to create high-quality and accurate target data models for master data management, data warehousing, or other data sourcing efforts. The inferred metadata is also useful for identifying data anomalies and data quality issues, and for validating that the existing metadata matches the data content. This helps ensure that new and existing data models are of the highest quality possible.
The CA ERwin® Model Validator (Model Validator) software ensures the integrity of your physical data models. The Model Validator software performs diagnostic checks against your physical models, using Relational Technology rules to validate the structural integrity of the physical model. The software is ideal for new database administrators and for seasoned DBAs learning a new RDBMS.
The CA ERwin® Model Navigator (Navigator) software provides access to the ERwin data models for project managers, programmers, business analysts, and other project resources that only require read-access to the models. The software provides all the functionality of ERwin for analysis and reporting, but the user is unable to save changes to the model. The Navigator product provides access to the ERwin data models at an affordable price.
The CA ERwin® Process Modeler (Process Modeler) software allows the modeler to create models to capture complex business processes. This software enforces the leading process modeling notations to ensure consistent business process models across the organization.
All of these products together provide all of the tools necessary to manage and govern the entire data architecture. The CA ERwin® Modeling Suite (Modeling Suite) software bundles the ERwin, Model Manager, Process Modeler, and Model Validator products together at a significant price reduction. Visit our software and solutions pages for additional information about data modeling and how to leverage these products in your data architecture.
Acquire ERwin
Acquire Model Manager
View data modeling solutions
View data modeling services
Enterprise Resource Planning (ERP)
Many organizations have invested heavily in the acquisition and implementation of Enterprise Resource Planning (ERP) packages to streamline specific business functions such as human resource, supply chain, financial, customer, warehouse, and decision support management. The ERP package itself often replaces two or more homegrown applications. The intention is to leverage the cross-functional capability of the package to replace disparate systems and consolidate the data into a single data model.
However, the data models that support these packages contain thousands of tables and attributes. Understanding where specific data is stored in these packages is difficult at best. In addition, the names of the tables and attributes are proprietary and often cryptic. Even worse, the relationships between tables and attributes are enforced by the application, not the data structure itself. This creates serious challenges for performing data modeling and data sourcing activities against these packages.
Data Innovations is now providing solutions to support the reverse engineering and profiling of your ERP software. Our solution utilizes the CA ERwin® Saphir Option (Saphir Option) to transform ERP metadata into CA ERwin® Data Modeler (ERwin) data models. These models are leveraged to integrate the ERP into your IT environment. The models are ideal for identifying data to be profiled and sourced from your ERP into external business intelligence solutions, such as data warehouses or master data management solutions.
Enterprise Resource Planning jump-start package
Sourcing ERP Reference Data
Master Data Management (MDM)
Every organization has data that is dispersed across multiple applications. The information is usually stored by an application in data structures such as files or relational databases. The data itself is usually stored in different formats and in different forms specific to the application. The applications tend to be disparate from one another and address specific business functions that may or may not be related to one another.
For example, an insurance company may have several applications that capture policy information specific to a line of business. If the insurance company provides both commercial and personal insurance policies, the information captured by the applications is likely to be very different. An example of this might include workers compensation policies on the commercial side and medical insurance on the personal side. It is unlikely that a single application would capture both types of policies because they are completely different and have characteristics specific to the type of policy and line of business. However, the information captured in these applications may be very similar or the same in some instances and would be needed for other applications and business functions, such as an insurance claims application.
The data needed by the claims application would be described as reference data. Reference data is data that is non-transactional in nature and describes objects and their properties. In the current example, the policy and insured data would be reference data. The claims application needs to be able to map an insurance claim to a policy and the insured. However, suppose that the person making the claim is a named insured on both a commercial workers compensation policy and a personal medical insurance policy. This could lead to problems when trying to reconcile claims made by the insured, especially if the data content is different between the policy applications.
In this circumstance, the insured information needs to be consistent across the two policy applications and the claims application. Master Data Management (MDM) helps to alleviate these types of problems by centralizing the reference data in a single location, usually described as a data hub. This approach allows for ongoing reconciliation between each of the applications so that we have consistent information for the insured, including consistent and correct data content between all of the applications. This is the primary goal of an MDM effort.
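The insurance example can be sketched in a few lines of code. This is only an illustration under stated assumptions: the record fields (`id`, `name`, `birth_date`, `address`), the match key of normalized name plus date of birth, and the function names are all hypothetical. A real data hub would use much richer matching rules.

```python
def normalize_key(record):
    """Hypothetical match key: case-insensitive name plus date of birth."""
    return (" ".join(record["name"].lower().split()), record["birth_date"])


def build_hub(sources):
    """Merge {source_name: [records]} into one master entry per insured."""
    hub = {}
    for source_name, records in sources.items():
        for record in records:
            master = hub.setdefault(
                normalize_key(record), {"source_ids": {}, "addresses": {}})
            # Remember which source record feeds this master entry.
            master["source_ids"][source_name] = record["id"]
            master["addresses"][source_name] = record["address"]
    return hub


def address_conflicts(hub):
    """Return the keys whose sources disagree on the insured's address."""
    return [key for key, master in hub.items()
            if len(set(master["addresses"].values())) > 1]


# The same person appears in both policy applications with different content.
commercial = [{"id": "WC-17", "name": "Dana  Cole",
               "birth_date": "1975-03-02", "address": "12 Elm St"}]
personal = [{"id": "MED-88", "name": "dana cole",
             "birth_date": "1975-03-02", "address": "90 Oak Ave"}]

hub = build_hub({"workers_comp": commercial, "medical": personal})
conflicts = address_conflicts(hub)
print(hub)
print(conflicts)
```

The hub holds one entry for the insured with a cross-reference to each source record, and the conflict list shows where the content must be reconciled before a claims application can rely on it.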
Data Innovations provides solutions to help develop MDM data hubs to solve these types of data problems. The following links provide overviews of some of these solutions:
Sourcing ERP Reference Data
Sourcing Reference Data
Data Profiling
Data profiling is the analysis of the data itself to infer metadata. Data profiling software is powerful technology when properly deployed and utilized. The inferred metadata is useful for many different purposes as described in the following sections.
Data Modeling
Data profiling infers detailed metadata at the column and table level, between tables, and across systems. The inferred metadata is ideal for validating column metadata, discovering and validating primary and foreign keys, and parent-child relationships within your data models. Data profiling ensures that your data models are accurately reflected by the data content. Use of data profiling increases the accuracy and quality of your data models significantly.
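As a rough illustration of column-level profiling, the sketch below infers a few pieces of metadata from raw string values. It is a minimal example and not how any CA or Exeros product works internally. The function name `profile_column`, the metadata fields, and the rule that a candidate key has no nulls and no duplicates are assumptions made for this example.

```python
def profile_column(values):
    """Infer simple metadata for one column from its raw string values."""
    non_null = [v for v in values if v not in (None, "")]

    def all_parse(cast):
        # True when every non-null value can be converted by `cast`.
        try:
            for v in non_null:
                cast(v)
            return True
        except ValueError:
            return False

    if not non_null:
        inferred = "unknown"
    elif all_parse(int):
        inferred = "integer"
    elif all_parse(float):
        inferred = "decimal"
    else:
        inferred = "text"

    distinct = len(set(non_null))
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": distinct,
        "inferred_type": inferred,
        "max_length": max((len(v) for v in non_null), default=0),
        # Candidate key: the column is populated on every row and never repeats.
        "candidate_key": bool(values) and distinct == len(values),
    }


# Hypothetical column extracts from a policy table.
policy_id_profile = profile_column(["1001", "1002", "1003", "1004"])
premium_profile = profile_column(["250.00", "", "310.50", "250.00"])
print(policy_id_profile)
print(premium_profile)
```

Comparing a profile like this with the declared column definitions in a data model shows where the documented metadata and the actual data content disagree.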
Data Profiling - Advanced Modeling
Data Warehousing
Data profiling is ideal for avoiding the code-load-explode development methodology for data warehousing efforts. Profile the data from each data source to identify the data content and overlap between sources. Leverage this analysis to create accurate ETL specifications to correct data anomalies, data quality problems, and consolidate data values. Leverage this same analysis to prototype, test and create accurate target data models for your data warehouse. Ensure the data in your data warehouse is of the highest quality by regularly profiling the data content to identify and address data anomalies and data quality problems.
Data Profiling for Data Warehousing
Master Data Management
Master data management is the practice of defining and managing the master reference data of an organization. Data profiling is ideal for performing the detailed analysis necessary to identify and harmonize the master reference data that is derived from multiple sources prior to populating an MDM data hub. Leverage the profiling results to create MDM validations for ongoing data quality and to ensure that only valid data is syndicated across your systems. Ensure your master reference data is of the highest quality possible by regularly profiling the data content in the data hub for data anomalies and data quality problems.
Sourcing ERP Reference Data
Sourcing Reference Data
Data Mapping
Implementing any new system, whether a purchased package or a custom developed application, requires some level of data mapping to integrate the new system with existing applications. The results of data profiling are ideal for mapping data between applications. Leverage the overlap and transformation discovery analysis to automate many of the mappings between disparate applications and data sources.
Data Quality
The inferred metadata produced by data profiling is ideal for locating data anomalies and data quality issues in existing data sources. Perform regular data quality assessments against your production data using data profiling to ensure that your data is of the highest quality possible. Leverage data profiling to enforce data quality standards across your data architecture for data governance efforts as well.
Business Rules
Data profiling is invaluable for identifying the business rules embedded within the data content of a data source. This is useful for understanding legacy applications or Enterprise Application (EA) and Enterprise Resource Planning (ERP) packages that are poorly documented or not well understood.
Understanding the Products
There are a number of different data profiling products available in the market today. This is confusing at times because there are a lot of similarities between the software products. The software vendors are not helpful because they all claim to perform the same functions. The following sections are intended to describe and distinguish the CA and Exeros profiling software products from other profiling products.
Basic Profiling
The CA ERwin® Data Profiler (Data Profiler) and the Exeros Discovery X-Profiler™ (X-Profiler) provide the basic column, table, and cross-system profiling. What separates the Data Profiler and X-Profiler from other profiling software products is that the primary keys, foreign keys, and overlaps between data sources are automatically inferred from the data content. Performing these operations in other profiling software requires additional analysis and significant manual effort to produce the same results.
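To give a feel for what inferring foreign keys and overlaps from the data content means, here is a minimal sketch based on value containment. It is an assumption-laden illustration, not the algorithm used by the Data Profiler or X-Profiler. The function names and the rule that a candidate foreign key requires a unique parent column covering every child value are chosen for this example.

```python
def overlap(child_values, parent_values):
    """Measure how much of the child column's content exists in the parent."""
    child = {v for v in child_values if v is not None}
    parent = {v for v in parent_values if v is not None}
    shared = child & parent
    return {
        "shared": len(shared),
        "child_coverage": len(shared) / len(child) if child else 0.0,
        "orphans": sorted(child - parent),  # child values with no parent row
    }


def is_candidate_foreign_key(child_values, parent_values):
    """A child column is a candidate foreign key when the parent column is
    unique and contains every non-null child value."""
    parent_non_null = [v for v in parent_values if v is not None]
    parent_unique = len(parent_non_null) == len(set(parent_non_null))
    coverage = overlap(child_values, parent_values)["child_coverage"]
    return parent_unique and coverage == 1.0


# Hypothetical columns from three data sources.
policy_ids = [1001, 1002, 1003]
claim_policy_ids = [1001, 1001, 1003, None]
legacy_policy_ids = [1001, 1004]

claim_fk = is_candidate_foreign_key(claim_policy_ids, policy_ids)
legacy_overlap = overlap(legacy_policy_ids, policy_ids)
print(claim_fk, legacy_overlap)
```

Running checks like these across every pair of columns by hand is the manual effort that automated inference is meant to remove.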
Advanced Profiling
The Exeros Discovery Transformation Analyzer™ (Discovery) is the most advanced profiling software product in the industry. This product includes all of the basic profiling provided in the Data Profiler and X-Profiler software products. However, this product also automates the discovery of complex transformations and business rules (concatenations, substrings, lookup tables, reverse pivots, complex case logic, aggregations, etc.) between disparate data sources. This simplifies data consolidation for data warehousing and master data management efforts by automating work that is performed manually using basic profiling technology.
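One of the transformation types listed above, concatenation, can be illustrated with a brute-force sketch. This is not the Discovery product's algorithm. It assumes the source rows and target values are already aligned row by row, that all values are strings, and that only a small, hypothetical set of separators needs to be tried.

```python
from itertools import permutations


def discover_concatenation(source_rows, target_values,
                           separators=("", " ", ", ", "-")):
    """Return (col_a, separator, col_b) rules where a + sep + b reproduces
    every target value. Rows and targets must be aligned one to one."""
    if not source_rows or len(source_rows) != len(target_values):
        return []
    columns = list(source_rows[0])
    matches = []
    for col_a, col_b in permutations(columns, 2):
        for sep in separators:
            if all(row[col_a] + sep + row[col_b] == target
                   for row, target in zip(source_rows, target_values)):
                matches.append((col_a, sep, col_b))
    return matches


# Hypothetical source columns and a target column from another system.
source = [
    {"first": "Ann", "last": "Lee", "city": "Reno"},
    {"first": "Raj", "last": "Patel", "city": "Ames"},
]
target = ["Lee, Ann", "Patel, Raj"]

rules = discover_concatenation(source, target)
print(rules)
```

Even this toy version shows why automation matters: the number of column pairs and candidate rules grows quickly as sources get wider.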
The Exeros Discovery Unified Schema Builder™ (USB) also provides functionality that is unique in the industry while also including all of the basic profiling provided by the Data Profiler and X-Profiler software products. USB allows the data modeler to analyze multiple data sources and then prototype the combination of those sources into a new target by proposing rules and populating a target schema before writing any ETL code. The profiling results identify data content inconsistencies between the data sources to quickly identify transformations for data anomalies, data quality, and data consolidation problems. This powerful modeling and data prototyping technology is ideal for creating data warehousing and master data management target data models.
Data Innovations is the leading data profiling service provider delivering successful data profiling projects for clients since 2002.
DI consultants have a proven track record of establishing the profiling environment, developing bulletproof data analysis methodologies, and mentoring clients to successful utilization of the technology. For more information on how DI can help your organization leverage data profiling, contact us at solutions@dataprofilers.com.
Data Profiling Quick-Start Solution
Data Profiling Jump-Start Package
How do organizations benefit from data profiling?
Implementing data profiling eliminates the code-load-explode development methodology for data warehousing or master data management projects, which occurs when ETL specifications are created based mainly on the institutional knowledge in the head of the business subject matter expert (SME). The SME-based specifications are utilized to develop the ETL code. Unit testing the code will often cause the code to explode due to unexpected problems with the data. The problems are identified and sent back through the development process to the SME to analyze and correct the ETL specifications.
These types of data problems are usually located incrementally, causing the process to repeat itself several times. Finally, when the code passes unit testing, it is then moved on to the next level of testing, system testing. The code-load-explode process then repeats itself again because more data is utilized during system testing. Problems uncovered during system level testing have to go all the way back to the beginning of the development process. This process is repeated throughout the different levels of testing and often leads to project overrun. The cost to the organization is easily calculated by tracking the development hours for reworking the code.
Even scarier is that the code-load-explode approach can allow bad data to make it into production targets. The cost of correcting data problems in production is expensive, but the cost to the organization may be even more expensive if erroneous business decisions are being made because of the data.
Can your business afford to have erroneous data in business intelligence applications or transaction systems?
Data profiling eliminates the code-load-explode method because the profiling software allows the SME to review all of the data content and the inferred metadata to get the specifications right the first time. The profiling results provide the detailed information necessary to create accurate ETL specifications. However, this is only one of the many ways that data profiling software can be leveraged by the organization.
Ensuring Business Continuity for Virtualization
Virtualization is a growing trend in the IT industry. Organizations are leveraging virtualization in distributed environments to reduce their overall physical server footprint and maximize the utilization of each physical server. Reducing the number of physical servers decreases the organization's carbon emissions and power consumption. This is the right thing to do for the environment and can reduce the organization's power and support costs.
However, introducing virtualization increases the complexity of ensuring that critical applications, such as company email or front-end applications, are always available. Data recoverability backed by robust and proven solutions is also essential to all virtual solutions.
CA, Inc. (CA) provides software to address business continuity for virtualization. The following solution identifies two CA software products that address high availability and disaster recovery for distributed virtual environments.
"Going Green" Jump-Start Package
Data Governance webcast materials now available for download
Harmonizing Reference Data webcast materials now available for download
Leveraging Data Profiling webcast materials now available for download
Data Innovations in the News:
CA Data Modeling Suite Offers Data Profiling - www.searchitchannel.com
CA Brings Data Modeling to the Masses - www.channelinsider.com
Exeros Signs Data Innovations as a New Reseller and Integration Partner - www.reuters.com
Exeros Launches X-Profiler Product - Exeros Delivers Full-Featured Cross-System Profiling Software at an Affordable Price - www.reuters.com