MDDB
About
This is a first draft of a system as described in this thread at Hydrogenaudio.
CoreLib is a similar effort and will possibly be merged with MDDB. Discussion of both systems currently takes place on the CoreLib mailing list (click on "more information about this list" to go to a page where you can subscribe).
The basic idea is to create a separate personal database of metadata. This has several advantages (see the thread linked to above for the full details):
- Modifying the metadata can be done independently of the file, which means it can be done faster (at least in theory).
- You aren't confined to the possibilities of the tag format your file format uses.
- It is trivial to give an encoded version of a song EXACTLY the same tag as the original.
- Metadata isn't removed when you remove the file.
- When re-ripping your collection only minimal effort should be required to get all the proper tags again.
- For a discussion of this idea and its implementation details, see this thread at Hydrogenaudio.
This implementation is still very much in development: nothing is guaranteed to remain the same, and most things don't work at all yet.
On March 20, 2004 a discussion took place on IRC; the transcript can be read here.
NOTE: If you would like to participate in development, please reply to this thread at Hydrogenaudio or (if you are not a member of Hydrogenaudio) mail me.
This project is closely tied to RDFDB, which is the RDF server on which MDDB is currently based.
Downloads
The latest version was uploaded on July 29, 2004.
NOTE: There is some documentation further down this page!
- MDDB 2004-07-29 sources (88 KB)
- The GetMetadata syntax has been completely revised. The new syntax is mostly backwards compatible; the most important difference is that the target and arc components are switched (so @dc:title now becomes dc:title). Quoting of URIs is also a bit stricter: QNames should not be quoted and URIs should always be quoted (although this isn't noticeable yet because RDFDB doesn't make any distinction between them).
Previous release notes:
GetMetadata now returns all children when you specify an empty string as key ("" is now the same as "*"), and it also allows for "/" now (needs the new RDFDB). This also contains a simple shell extension that can show some information from MDDB in certain columns in Explorer (artist and title). Besides those changes the code has been cleaned up a bit in some places and a few bugs were fixed.
Previous release notes:
ML_LooseIdentifyFile to retrieve possible identifications, Init/Deinit functions, newline quoting and a bugfix.
Previous release notes:
This release adds support for more advanced error reporting, updates RDFDB_QueryNext to the new output format of RDFDB and makes RDFDB_QueryNext a bit more robust.
Previous release notes:
The most important changes in this release are a completely new GetMetadata function and a change in policy regarding calls to Close and Destroy functions (see the version history below for more information). The DoxyGen documentation has also been updated; users of the library should have a look at the Modules section of the documentation.
- MDDB 2004-07-29 binaries (81 KB)
- A compiled version of the above sources (contains MDDBLib.dll, MDDBTool.exe and ColumnHandler.dll).
If you have any questions, just mail me.
GetMetadata query syntax
The new syntax for the GetMetadata functions is based on XPath2 and currently looks as follows (see RFC 2234 for a description of the grammar notation):
RDFPath = AbsoluteRDFPath / RelativeRDFPath
AbsoluteRDFPath = "/" RelativeRDFPath
RelativeRDFPath = Course *("/" Course)
Course = "." / ([Selector "!"] TripleSelection ["#" Type] ["[" Constraint *("," Constraint) "]"])
Selector = "subject" / "subj"
/ "predicate" / "pred"
/ "object" / "obj"
/ "statement" / "stat" ; Note that the interpretation of subject and object depends on whether or not the backlink operator is used.
TripleSelection = [".."] [Arc] ["@" Target] ; The third variable is the context node. Note that the role of source and target are inverted by the backlink operator.
Type = URI / QName
Constraint = RelativeRDFPath [ComparisonOperator Target] ; If the comparison operator and the target are omitted the path is only checked for existence.
Arc = "*" / URI / QName
Target = URI / QName / Literal
URI = "<" URIReference ">"
QName = [Prefix ":"] LocalPart
ComparisonOperator = "=" / "!=" / ">" / "<" / ">=" / "<=" / "~" / "!~"
Literal = Number / (String ["^^" DataType])
Prefix = NCName
LocalPart = NCName
Course should be read as:
All nodes with the URI Target, which are an Arc property of the context node and are of type Type.
If a certain component (Target, Arc or Type) isn't present, the selection is not restricted by that part of the Course. A Course that starts with '..' is interpreted as a backlink; in that case the above sentence becomes:
All nodes with the URI Target, which have an Arc property that has the context node as value and are of type Type.
If the query starts with a slash (/), it is considered to be an absolute path. To retrieve all property values of a certain URI you can use "/@URI/"; to retrieve all nodes of a certain type you can use "/#URI".
Certain restrictions apply if you use a Literal in a Course. For one thing, you can't use a Literal with a backlink, as a literal can't be the source of a triple.
Also, the next Course should be a backlink, again because a literal can't be the source of a triple.
There is also one other case that doesn't work: a query of the form /@<target>[dc:title='sometitle'].
Examples
- dc:title
- <http://purl.org/dc/elements/1.1/title>
- #mm:Album/dc:title
- /@<http://domain/someArtist>/dc:title
- /@2/..mm:trackNum
- /@"Madonna"/..dc:title/..dc:creator/dc:title
- /#mm:Artist
- /#mm:Artist/..dc:creator/dc:title
That last example would retrieve the titles of all nodes that have an artist as creator.
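To make the grammar and the examples above a bit more concrete, here is a minimal sketch in Python of how a path could be split into Courses and each Course broken into its backlink, Arc, Target and Type parts. It is not the RDFPath parser that ships with MDDB. The names split_path and parse_path are made up for this illustration, and the sketch assumes a simplified subset: no Selectors, no Constraints, no "^^" datatypes, and only double-quoted strings and plain integers as literals.

```python
import re

# Building blocks for a simplified Course (assumed subset of the grammar).
NAME = r'[A-Za-z_][\w.-]*'            # rough stand-in for NCName
QNAME = rf'(?:{NAME}:)?{NAME}'        # [Prefix ":"] LocalPart
URI = r'<[^>]*>'                      # "<" URIReference ">"
LITERAL = r'"[^"]*"|\d+'              # double-quoted string or integer only

ARC = rf'\*|{URI}|{QNAME}'
TARGET = rf'{URI}|{QNAME}|{LITERAL}'
TYPE = rf'{URI}|{QNAME}'

# [".."] [Arc] ["@" Target] ["#" Type]
COURSE = re.compile(
    rf'(?P<backlink>\.\.)?'
    rf'(?P<arc>{ARC})?'
    rf'(?:@(?P<target>{TARGET}))?'
    rf'(?:#(?P<type>{TYPE}))?'
)


def split_path(path):
    """Split on '/' characters that are not inside <...> or "..."."""
    parts, buf, closer = [], [], None
    for ch in path:
        if closer:                      # inside a URI or a string literal
            buf.append(ch)
            if ch == closer:
                closer = None
        elif ch in '<"':
            closer = '>' if ch == '<' else '"'
            buf.append(ch)
        elif ch == '/':
            parts.append(''.join(buf))
            buf = []
        else:
            buf.append(ch)
    parts.append(''.join(buf))
    return parts


def parse_path(path):
    """Return (is_absolute, list of parsed Courses) for a simplified RDFPath."""
    parts = split_path(path)
    absolute = len(parts) > 1 and parts[0] == ''
    if absolute:
        parts = parts[1:]               # drop the empty part before the leading '/'
    courses = []
    for part in parts:
        if part == '.':                 # "." selects the context node itself
            courses.append({'self': True})
            continue
        m = COURSE.fullmatch(part)
        if m is None:
            raise ValueError('cannot parse course: %r' % part)
        courses.append({
            'backlink': m.group('backlink') is not None,
            'arc': m.group('arc'),      # None means "not restricted"
            'target': m.group('target'),
            'type': m.group('type'),
        })
    return absolute, courses
```

Calling parse_path on one of the examples, for instance parse_path('/@2/..mm:trackNum'), yields an absolute path with two Courses: one that only restricts the Target, and one backlink Course that only restricts the Arc. A component that is None corresponds to "not restricted by that part" in the description above.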
Shell extensions
Currently there is one shell extension available for MDDBLib, a ColumnProvider that enables you to show certain information from MDDB in Explorer (currently it only shows the title - @dc:title - and the artist - @dc:creator/@dc:title).
System Requirements: The ColumnProvider (ColumnHandler.dll) needs at least Windows 2000.
To install the ColumnProvider (ColumnHandler.dll), first make sure that MDDBLib.dll is either in your Windows directory or in some directory in your path (System Properties\Advanced\Environment Variables; it's the PATH environment variable, case doesn't matter). Good places to put it are your Windows\System32 directory or a directory of its own (in Program Files, for example), but in the latter case you will have to add this directory to your path. You can put ColumnHandler.dll anywhere, but it's a good idea to keep it somewhere you can find it again. Now run the following command from the directory where you put ColumnHandler.dll:
regsvr32 ColumnHandler.dll
Explorer should now list "MDDB Title" and "MDDB Author" (these names might change in the future, but for now these are used to make it easy to debug the dll) as possible columns in the "Choose details" dialog (look in the View menu). If you select these columns you should see them appear in the details view. Normally they should be empty, but if a filename contains a LUID (for example: "My Song {LUID12345678}.mp3") ColumnHandler.dll will query MDDB for this LUID and show the results in its columns. To stop making use of ColumnHandler.dll you can issue the following command from the directory you put it in:
regsvr32 /u ColumnHandler.dll
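The step from a filename to a LUID URI, as described above, can be sketched in a few lines of Python. This is only an illustration and not the code used by ColumnHandler.dll or ML_IdentifyFile. The function name identify_file is made up, and the sketch assumes that a LUID is any run of characters between "{LUID" and "}" (the examples on this page only use digits).

```python
import re

# Assumption: a LUID is whatever sits between "{LUID" and the closing brace.
LUID_PATTERN = re.compile(r'\{LUID([^{}]+)\}')


def identify_file(filename):
    """Return the "luid:" URI for the first LUID in filename, or None."""
    match = LUID_PATTERN.search(filename)   # search() stops at the first match
    if match is None:
        return None
    return 'luid:' + match.group(1)
```

With this sketch, identify_file("My Song {LUID12345678}.mp3") gives the URI "luid:12345678", and a filename without a LUID gives None, which corresponds to the empty columns mentioned above.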
To let ColumnHandler.dll find anything you can populate the database by loading RDF into it through one of the RDFDB frontends (use the load query). For example the following bit of RDF (using the Turtle syntax) could be used to describe a song:
@prefix dc: <http://purl.org/dc/elements/1.1/> .

<luid:12345678>
    dc:title "My Song" ;
    dc:creator <some_artists_URI> .
<some_artists_URI>
    dc:title "Some Artist" .
If this were loaded into RDFDB (using load turtle <file://path_to_file> into database_name) and you were to view a directory with Explorer which was configured to show the MDDB columns and contained the file "My Song {LUID12345678}.mp3", ColumnHandler.dll would show "My Song" in the "MDDB Title" column and "Some Artist" in the "MDDB Author" column.
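The lookup described above can be imitated with a small in-memory model, which may help to see why those two values end up in the columns. The following Python sketch holds the three triples from the Turtle example in a list and follows properties forwards (and, for comparison, backwards, like the '..' backlink operator). It does not talk to RDFDB; the helper names follow, follow_back and columns_for are made up for this illustration, and for simplicity it does not distinguish URIs from literals.

```python
DC = 'http://purl.org/dc/elements/1.1/'

# The three statements from the Turtle example, as (source, arc, target).
TRIPLES = [
    ('luid:12345678', DC + 'title', 'My Song'),
    ('luid:12345678', DC + 'creator', 'some_artists_URI'),
    ('some_artists_URI', DC + 'title', 'Some Artist'),
]


def follow(nodes, arc, triples=TRIPLES):
    """Forward step: values of property arc for each context node."""
    return [o for n in nodes for (s, p, o) in triples if s == n and p == arc]


def follow_back(nodes, arc, triples=TRIPLES):
    """Backlink step: nodes whose property arc has a context node as value."""
    return [s for n in nodes for (s, p, o) in triples if o == n and p == arc]


def columns_for(luid_uri):
    """Collect the values the two MDDB columns would be filled from."""
    title = follow([luid_uri], DC + 'title')
    # Artist name: first step to the creator, then to the creator's title.
    artist = follow(follow([luid_uri], DC + 'creator'), DC + 'title')
    return {'MDDB Title': title, 'MDDB Author': artist}
```

Here columns_for('luid:12345678') collects "My Song" for the title and "Some Artist" for the author, and follow_back(['some_artists_URI'], DC + 'creator') finds the song again from the artist's side.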
By default, ColumnHandler.dll tries to connect to RDFDB on localhost at port 7001 and uses the "foo" database. These settings can be overridden in the registry. The dll looks in the following places, in this order; if it doesn't find a certain setting in one place it looks in the next, and if it has looked everywhere without finding it, it uses the default. This also means you can selectively override settings:
- HKEY_CURRENT_USER\Software\MDDB\ShellExtensions\ColumnProvider
- HKEY_CURRENT_USER\Software\MDDB\ShellExtensions
- HKEY_CURRENT_USER\Software\MDDB
- HKEY_LOCAL_MACHINE\Software\MDDB\ShellExtensions\ColumnProvider
- HKEY_LOCAL_MACHINE\Software\MDDB\ShellExtensions
- HKEY_LOCAL_MACHINE\Software\MDDB
The following settings are supported:
HostName (String Value)
- Can be either the string representation of an IP address or a hostname that can be resolved to an IP address. It should point to the server RDFDB is installed on.
Port (DWORD Value)
- The port number RDFDB listens on.
DatabaseName (String Value)
- The name of the database ColumnHandler.dll should use.
UserName (String Value)
- The username ColumnHandler.dll should use to log in on the database.
Password (String Value)
- The password ColumnHandler.dll should use to log in on the database.
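The search order and the selective overriding described above can be sketched as follows. This is a Python simulation that uses a plain dictionary in place of the Windows registry, so it is not how ColumnHandler.dll reads its settings; the function name resolve_setting and the fake_registry contents are made up. The defaults for HostName, Port and DatabaseName are the ones mentioned above; no defaults are documented here for UserName and Password, so the sketch returns None for those.

```python
# Registry keys in the order in which they are consulted.
SEARCH_ORDER = [
    r'HKEY_CURRENT_USER\Software\MDDB\ShellExtensions\ColumnProvider',
    r'HKEY_CURRENT_USER\Software\MDDB\ShellExtensions',
    r'HKEY_CURRENT_USER\Software\MDDB',
    r'HKEY_LOCAL_MACHINE\Software\MDDB\ShellExtensions\ColumnProvider',
    r'HKEY_LOCAL_MACHINE\Software\MDDB\ShellExtensions',
    r'HKEY_LOCAL_MACHINE\Software\MDDB',
]

# Documented defaults; UserName and Password have no documented default.
DEFAULTS = {'HostName': 'localhost', 'Port': 7001, 'DatabaseName': 'foo'}


def resolve_setting(name, registry):
    """Return the first value found for name, else the default (or None).

    registry is a dict mapping key paths to dicts of values; it stands in
    for the real registry in this simulation.
    """
    for key in SEARCH_ORDER:
        values = registry.get(key, {})
        if name in values:
            return values[name]
    return DEFAULTS.get(name)


# Hypothetical example: one machine-wide setting, one per-user override.
fake_registry = {
    r'HKEY_LOCAL_MACHINE\Software\MDDB': {'HostName': 'server', 'Port': 7002},
    r'HKEY_CURRENT_USER\Software\MDDB\ShellExtensions': {'Port': 7003},
}
```

In this example the per-user Port wins over the machine-wide one, HostName comes from the machine-wide key, and DatabaseName falls back to the default because no key sets it.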
Version history
July 29, 2004 (corresponds to RDFDB 0.47 mod 13):
- Slightly changed the GetMetadata syntax and made it much more powerful (makes use of a new separate RDFPath parser).
- Updated string quoting support.
- Some small internal changes.
July 15, 2004 (corresponds to RDFDB 0.47 mod 12):
- Updated to the new RDFDB version (login support).
- GetMetadata queries now support literals as targets (mostly useful with backlinks, as in /2/.. which returns all nodes that have a property whose value is 2). Also started work on supporting constraints (as in /#mm:Artist[dc:title="Madonna"]).
- Added a performance test to MDDBTool.
- Finally exported the ML_RemoveProperty (added it to the .def file).
May 30, 2004 (corresponds to RDFDB 0.47 mod 10):
- A basic shell extension makes its debut in this version, it's a ColumnProvider that can show the artist and title of files in Explorer (the files need to have the right LUID in their filename).
- URIs created from LUIDs now always start with "luid:" (if a file is called "{LUID1234}" ML_IdentifyFile will return the URI "luid:1234"), this has been done to make it easier to enter information about LUIDs in RDF/XML, N-Triples or Turtle (without this change the LUIDs would be interpreted as relative URIs). ML_LooseIdentifyFile doesn't take notice of such things, it only cares about URIs.
- ML_GetMetadataOpen now has support for "" (empty string, equivalent to "*") and "/" (returns all nodes, or in this implementation all sources).
- Fixed a bug in ML_LooseIdentifyFile (it would crash on some inputs due to freeing too much memory).
- Made sure ML_IdentifyFile would only find one LUID (it would simply continue searching for a LUID after it had found one, which could cause a memory leak).
May 22, 2004 (corresponds to RDFDB 0.47 mod 9):
- Added ML_Init and ML_Deinit to make initing easier (not everything works from within DllMain).
- Added the ML_LooseIdentifyFile function that looks for possible matches based on certain information.
- RDFDB_QueryNext now handles the error reporting capability of RDFDB correctly (it used to ignore it due to a stupid mistake I made).
- Created the infrastructure for supporting more than one server type (added some parameters for server and query types).
- Newlines are now quoted and unquoted in string literals, which results in more reliable operation (the only weaknesses in this department are in column counting and very strange - and invalid! - URIs).
May 11, 2004 (corresponds to RDFDB 0.47 mod 8):
- Added the ML_GetLastError function to allow for more advanced error reporting.
- Made (un)quoting more flexible: both " and ' are supported now.
- Changed RDFDB_QueryNext to be compatible with the new output format of RDFDB.
- RDFDB_QueryNext has fewer problems with tabs in text literals now; it only has problems with them if it has to count the number of columns (usually never or only on the first row). Problems with newlines in text literals remain.
- ML_GetMetadata is now able to report certain kinds of syntax errors.
May 1, 2004 (still based on RDFDB 0.47 mod 7):
- Added the ML_RemoveProperty function (as of yet untested, but it uses the same code as ML_PutProperty, so it should be okay).
- The ML_GetMetadata function has been replaced by ML_GetMetadataOpen/Next/Close.
- The syntax of the ML_GetMetadata queries has changed radically, it's now more like XPath (I used XPath2 as reference).
- Changed the policy regarding who calls Close and Destroy functions: the user of the library only has to call Close to stop a query prematurely. Destroy functions should still be called, but there are a lot fewer of them (all Destroy*State and DestroyConnection functions have been merged with Close and Disconnect functions).
- Added a new error value that is to be used by a frontend in development by nerochiaro.
- Updated the DoxyGen documentation, it's now reasonably useable (the Modules section lists all the interesting parts for users of MDDBLib).
- Renamed some files and removed some old stuff.
April 24, 2004 (corresponds to RDFDB 0.47 mod 7):
- MDDBTool now actually has some functionality! It can enumerate the properties and backlinks of URIs passed on the commandline.
- Added the ML_PutProperty function.
- Added the flags field to ML_FSData, to make it possible to distinguish backlinks.
- Modified the RDFDB Query functions to be more compatible with non-select queries (RDFDB_Query is the recommended query function for these types of queries now).
- Made it possible to quote URIs when using ML_GetMetadata (you can now query for <dc:title> for example, but also for <http://purl.org/dc/elements/1.1/title>, which used to fail because it saw the dot as a delimiter).
- Replaced all QNames (rdf:type for example) by their equivalent absolute URIs (http://www....#type).
April 15, 2004 (corresponds to RDFDB 0.47 mod 6):
- ML_FindObjectByTypeOpen/Next/Close is implemented.
- ML_FindPropertyByClassOpen/Next/Close is implemented.
- Made sure no memory leaks occur (at least under normal circumstances, there might be some left in error situations).
- Some bugfixes and clean ups (most notably the error handling of RDFDB_Query and RDFDB_QueryCallback was improved).
March 21, 2004 (corresponds to RDFDB 0.47 mod 5):
- RDFDB_QueryOpen/Next/Close is implemented.
- ML_FindPropertyOpen/Next/Close is implemented (at least some basic support).
- Code clean ups and a small bugfix.
March 19, 2004 (corresponds to RDFDB 0.47 mod 5):
- RDFDB_Query and RDFDB_QueryCallback now use mostly the same code.
- Updated to work with RDFDB mod 5.
March 14, 2004 (first release, corresponds to RDFDB 0.47 mod 3):
- ML_IdentifyFile has VERY basic functionality (can only scan a filename for a LUID).
- ML_GetMetadata has basic functionality (you can give it a URI instead of a LUID and it will perform simple queries).
- RDFDB connections seem to work.
To contact me, please mail to: th.v.d.gronde@hccnet.nl
I hope you'll enjoy my program(s).