This post is from the CollabNet VersionOne blog and has not been updated since the original publish date.
CollabNet TeamForge Datamart
Definition and Purpose
A datamart, generally spoken, is a repository containing a subset of data collected in a data warehouse. It’s usually deployed to simplify and accelerate information access, often for a variety of stakeholders, which may include internal and external stakeholders. A datamart may support tasks such as data analysis, content presentation, reporting, trending analysis etc. Often datamarts don’t exist in isolation, however alongside other datamarts, each one focusing on specific business needs.
About TeamForge Datamart
Historical data can be reported for analytic purposes, and trending reports can be generated. Users can write ad-hoc queries or use many of the BI reporting tools available on the market to generate reports against the TeamForge datamart.
The TeamForge production database supports the day-to-day transactions, whereas TeamForge datamart cater to the BI reporting needs of the organization. Since the design of the TeamForge production database is optimized for handling daily transactional data (for read / write), it’s not the most effective repository to generate complex BI reports against. TeamForge datamart fills this gap by extracting, transforming and loading (ETL) aggregated data and pre-processes snapshots of production data in an easy to analyse and report format.
The TeamForge datamart is implemented in forms of star schemas for Users, SCM and Tracker. Data for the dimension and fact tables in the datamart is collected from the operational (transactional) database (TeamForge production database), through ETL processes. ETL processes are sequenced in the appropriate order and assigned to jobs. These jobs can be scheduled using an integrated job scheduler to run automatically at regular intervals. Upon successful job completion, data from the TeamForge production database is extracted, processed and loaded in to the TeamForge datamart tables.
All the jobs in TeamForge system can be categorized under two main job types, viz. Initial load & Incremental load. This job will be used to processing and loading of the historical data from the TeamForge production system. This might take some time depending on the volume of historical data. Thereafter, delta load (incremental load) jobs will be set up to run periodically extracting data from the production database into the data mart. The frequency and time of the ETL job runs can be set to periodic intervals by the users (typically administrators, with the necessary privileges). Users can also trigger the jobs manually as on need basis.
Users can also specify the time at which the reporting data is refreshed from the production database. Users can use reports to display data and group relevant information appropriately and specify intervals at which the datamart extracts TeamForge data from the datamart.
By using TeamForge datamart users can generate for example the following reports:
Users — Information on the number of users logged in each day and over a period of time.
SCM Activity — Information on the number of commits made in the SCM repository and information about the users who have made the commits. Additionally, counts of files added, deleted, modified, moved & copied.
Tracker — Information on reports on all artifact events – create, update, delete, close, re-open by day and by priority.
Find more information visit: http://help.collab.net/topic/teamforge620/action/reports.html