Category Database

Generating Data Model using SchemaSpy

Often we need to work on an existing database schema. Quickly mastering the database structure can be a challenge, since we might not be able to get the data model (ERD) from the project repository.

However, as long as you have access to the database, you should be able to generate the data model using SchemaSpy. SchemaSpy is LGPL-licensed and was developed by John Currier in 2004. It has been improved steadily since then.

SchemaSpy is written in Java and distributed as a jar file. The current version is 6.1. It’s a command-line tool. The output is a folder of HTML and other files that outline the database tables, views and their relationships. But before you run it, you need to have a JDBC driver and GraphViz ready...
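To give a feel for what a run looks like, here is a minimal Python sketch that assembles a SchemaSpy command line. The flags (`-t`, `-dp`, `-host`, `-port`, `-db`, `-s`, `-u`, `-p`, `-o`) are the ones I understand SchemaSpy to accept, so please check them against the help output of your own version. The jar name, driver file, database name and user below are placeholders, and `build_schemaspy_command` is a hypothetical helper, not part of SchemaSpy.

```python
import shlex


def build_schemaspy_command(jar_path, db_type, driver_path, host, port,
                            database, schema, user, output_dir, password=None):
    """Assemble the argument list for a SchemaSpy run (hypothetical helper)."""
    cmd = [
        "java", "-jar", jar_path,
        "-t", db_type,        # database type, e.g. "pgsql" (assumed value)
        "-dp", driver_path,   # path to the JDBC driver jar
        "-host", host,
        "-port", str(port),
        "-db", database,
        "-s", schema,
        "-u", user,
        "-o", output_dir,     # folder that receives the generated HTML
    ]
    if password is not None:
        cmd += ["-p", password]
    return cmd


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    cmd = build_schemaspy_command(
        jar_path="schemaspy.jar",
        db_type="pgsql",
        driver_path="postgresql.jar",
        host="localhost",
        port=5432,
        database="sales",
        schema="public",
        user="reader",
        output_dir="schemaspy-output",
    )
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to try on a machine that has no database or Java installed yet.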


Batch Data Processing Architecture for Visualization and Analytics

One of my municipal clients needs to handle batch data from multiple data sources for visualization and analytics. The data sources can be ERP systems, sensors or internal databases. Sensors load data into a time series database continuously. The BI dashboard must show charts based on the latest data in near real time.

Let’s see how we design the architecture to satisfy the requirements.

Batch Data Processing for Visualization and Analytics

First, we have the data sources listed on the left, including ERP, backend systems based on time series databases, other databases and data flowing in from APIs;

Second, we have an ETL process to load data from the data sources into the DVA platform. You may choose any ETL tool you like, but we highly recommend Lionsgate Software’s LiveSync Au...


Seven Steps to Effective Data Governance

The concept and discipline of data governance have grown in importance as organizations are forced to comply with regulations, cut costs, integrate different COTS (commercial off-the-shelf) systems, and provide data interoperability through internal and external APIs.

One of my retail clients has a large daily flow of transaction data among its retail, wholesale inventory, financial, auditing and warehouse departments. It runs separate COTS systems such as JDA ESO, Shopify for eCommerce and Tecsys SCM. It also maintains in-house store master and product master data. In one system, the SKU is defined as a 6-digit number; in another, it’s defined as a varchar. In such a complex environment, data management becomes a prominent issue.
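To make the SKU mismatch concrete, here is a small Python sketch of one way to reconcile the two representations. It assumes the canonical form is a zero-padded 6-character digit string; that convention and the `normalize_sku` helper are my own illustration, not something taken from the client's systems.

```python
def normalize_sku(raw, width=6):
    """Return a SKU as a zero-padded digit string (assumed canonical form)."""
    text = str(raw).strip()
    # Reject anything that is not plain ASCII digits, e.g. "AB-12".
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"SKU is not numeric: {raw!r}")
    if len(text) > width:
        raise ValueError(f"SKU is longer than {width} digits: {raw!r}")
    return text.zfill(width)


if __name__ == "__main__":
    # The numeric system may hand over 1234, the varchar system " 001234 ".
    print(normalize_sku(1234))
    print(normalize_sku(" 001234 "))
```

With a rule like this agreed on, the same product can be matched across systems regardless of how each one stores the SKU.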

In order to improve customer experience, increase efficien...
