Database Modeling in UML

ERWin is a long term standard database modeling tool for building up ERD. Usually data architect spends lots of time on handling the relationships among entities, then utilizes forward engineering function to generate database schema.

During database modeling practices, an outstanding issue has been noticed that some of our clients made mistakes in transaction design. All the transactions of all kinds of activities are kept in one transaction table, with fields createdby, createdon, createdusing, modifiedby, modifiedon and modifiedusing, etc. Initially, the system might work properly. But with time goes, e.g., after 3 years, the transaction table is very likely occupies more than half of the disk spaces of the entire database. This brings performance issue when creating any transaction table related queries and reports. For my client’s instance, to get a report takes more than 20 minutes – which is unacceptable!

How do we resolve this issue? The answer is to trace all the created-stamps and modified-stamps in each activity table, instead of using one huge transaction table. But this brings another issue, what if there are hundreds of activity-related tables, when we do data modeling design, do we have to add these stamp fields to every table – hundreds times repeatedly? If we keep using traditional ERWin, the answer is Yes, unfortunately.

To avoid this, we need do data modeling in UML. There are a few tools out there which support data modeling in UML, including Oracle JDeveloper, IBM Rational Rose Data Modeler, Sparx EA, Microsoft Visio EE and Visual Paradigm, etc. After careful evaluation, we found that Visual Paradigm offers complete functions which allow us achieve data modeling in UML efficiently and smoothly. It’s a tool built on Eclipse as a plug-in.

A few key functions that Visual Paradigm offers:

  1. Import Visio VDX file (in ERD form), good for converting Visio-based ERD to UML;
  2. Reverse engineering from database;
  3. Generate UML from ERD (Sync to class diagram) – UML diagram as logical data model;
  4. Work on generalization and composition relationships in UML;
  5. Sync to Entity Relationship diagram (ERD) – ERD as physical data model before forward engineering;
  6. Generate DDL from ERD (forward engineering).

Please note, in step 4, since we are able to work on UML diagram, generalization relationship is available. So it’s easy to put all of those transaction stamp fields in an abstract class and this class can be inherited by all others. This saves tons of work and makes the diagram clearer. Please feel free to read Case Study: Data Modeling Transition for BI as a real world best practice for data modeling in UML.

References

  1. Data modeling in UML and ORM: a comparison, Visio, Microsoft
  2. Data modeling in UML and ORM revisited, Visio, Microsoft
  3. UML Data Modeling Profile, IBM
  4. Using Logical Models in UML for Database Development, Oracle
  5. Database Modeling in UML, Sparx Systems
  6. UML as a Data Modeling Notation, by David C. Hay, Michael J. Lynott