Wednesday, September 2, 2009

What's in those files in my Derby DB?

If you look inside your Derby database on disk, you'll see a number of files:


-bash-2.05b$ ls db/seg0
c101.dat c1b1.dat c260.dat c31.dat c3d1.dat c71.dat c8f1.dat ca1.dat
c10.dat c1c0.dat c271.dat c321.dat c3e1.dat c81.dat c901.dat cb1.dat
c111.dat c1d1.dat c281.dat c331.dat c3f1.dat c850.dat c90.dat cc0.dat
c121.dat c1e0.dat c290.dat c340.dat c400.dat c861.dat c911.dat cd1.dat
c130.dat c1f1.dat c2a1.dat c351.dat c411.dat c871.dat c921.dat ce1.dat
c141.dat c200.dat c2b1.dat c361.dat c41.dat c881.dat c930.dat cf0.dat
c150.dat c20.dat c2c1.dat c371.dat c421.dat c891.dat c941.dat
c161.dat c211.dat c2d0.dat c380.dat c430.dat c8a1.dat c951.dat
c171.dat c221.dat c2e1.dat c391.dat c441.dat c8b1.dat c960.dat
c180.dat c230.dat c2f0.dat c3a1.dat c451.dat c8c1.dat c971.dat
c191.dat c241.dat c300.dat c3b1.dat c51.dat c8d1.dat c981.dat
c1a1.dat c251.dat c311.dat c3c0.dat c60.dat c8e1.dat c990.dat


What are these files?

The Derby storage layer uses the concept of a conglomerate, which is a collection of related data stored in a single file. Each table in your database is stored in a conglomerate, and each index of each table is stored in its own conglomerate, so you have one file per table-or-index in your database.

Note that sometimes indexes are created explicitly, via CREATE INDEX, and sometimes indexes are created automatically, such as when you declare a UNIQUE constraint or a PRIMARY KEY constraint.

Derby tracks each conglomerate separately, by its assigned conglomerate number. The conglomerate number is used to create the file name by converting it to hex and constructing the file name cHHH.dat, where HHH is the conglomerate number in hex.

Derby records information about each conglomerate in its system catalogs, so you can simply run a query to find out which conglomerate is used by which table:


select c.conglomeratenumber, c.isindex, t.tablename
from sys.sysconglomerates c,
sys.systables t
where c.tableid=t.tableid;


So, for example, if you're wondering what is being stored in file c850.dat, you can simply convert 850-hex to 2128-decimal, and then run:


select c.conglomeratenumber, c.isindex, t.tablename
from sys.sysconglomerates c,
sys.systables t
where c.tableid=t.tableid
and c.conglomeratenumber=2128;


Or, if you're wondering which files are used by table EMPLOYEES:


select c.conglomeratenumber, c.isindex, t.tablename
from sys.sysconglomerates c,
sys.systables t
where c.tableid=t.tableid
and t.tablename='EMPLOYEES';


Note that about 60 of the files are used for the system catalogs themselves, so every Derby database will have at least 60 files in its storage directory.

No comments:

Post a Comment