IPTC/XMP for hierarchical categories + internal DB


Olivier_G
XnThusiast
Posts: 1423
Joined: Thu Dec 23, 2004 7:17 pm
Location: Paris, France

Post by Olivier_G »

I have been thinking about various ways to organize pictures correctly, and wondered whether XnView could benefit from this.

Small summary:
- prefer standards (IPTC, XMP, etc...)
- decentralize data (metadata in the file) as much as possible
- complement with an internal DB to handle data efficiently, and to cover cases where data cannot be decentralized (i.e. file formats not supporting IPTC, etc...)

Notes:
- XMP is not yet widely used (except Adobe), but IPTC may adopt XMP for its next version and get a boost. XMP should be on the list...
- IPTC can be used for JPEG, TIFF, PSD as well as some RAW (NEF, THM+CRW) and maybe other formats.
- From what I can see in IPTC and XMP, you should not expect support for hierarchical categories any time soon (it would imply standardizing those categories, or encapsulating the whole structure in every file), unless they adopt the same kind of solution as the one explained below... :mrgreen:
- Hierarchical categories are great for organizing images (most IMS use them with good reasons).


The solution I have come up with is to store the complete hierarchical path of each category into the IPTC field.

As an example, if I have the following categories:
- Animal
- - Cat
- - Bird
- - - Eagle
- - - Sparrow
...
and would like to use the category "Sparrow", I would use "Animal|Bird|Sparrow" in the IPTC field.
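
A minimal sketch of how such a path string could be built, assuming the program keeps a child -> parent table. The names `PARENT` and `category_path` are made up for illustration; they are not part of XnView or of the IPTC standard.

```python
# Hypothetical parent table for the example hierarchy (child -> parent).
PARENT = {
    "Animal": None,
    "Cat": "Animal",
    "Bird": "Animal",
    "Eagle": "Bird",
    "Sparrow": "Bird",
}

SEPARATOR = "|"  # the separator used in the example above


def category_path(name, parent=PARENT):
    """Return the full path from the root down to `name`, joined with '|'."""
    parts = []
    while name is not None:
        parts.append(name)
        name = parent[name]  # walk up one level
    return SEPARATOR.join(reversed(parts))


print(category_path("Sparrow"))  # Animal|Bird|Sparrow
```

The separator has to be a character that never appears in a category name, otherwise the path cannot be split back reliably.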

The program would then replicate this information in an internal database to speed up searching, browsing through categories, changing a category, moving an entire branch to another node, etc... (think drag and drop)
By storing the path of each file, the program could also save/restore this data if it gets corrupted, and import/export/edit it.
And the internal database would manage directly the files for which no IPTC/XMP data can be encapsulated.
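
To illustrate the replication idea, here is a rough in-memory stand-in for the internal database. It is only a sketch under my own assumptions: `CategoryIndex` and its methods are hypothetical names, and a real implementation would persist the data somewhere.

```python
from collections import defaultdict


class CategoryIndex:
    """In-memory stand-in for the internal DB: category path -> file paths."""

    def __init__(self):
        self._files = defaultdict(set)

    def add(self, file_path, category_path):
        """Record that a file carries the given full category path."""
        self._files[category_path].add(file_path)

    def files_in(self, category_path):
        """Files tagged with the category itself or with anything below it."""
        prefix = category_path + "|"
        found = set()
        for path, files in self._files.items():
            if path == category_path or path.startswith(prefix):
                found |= files
        return sorted(found)


index = CategoryIndex()
index.add("img/001.jpg", "Animal|Bird|Sparrow")
index.add("img/002.jpg", "Animal|Bird|Eagle")
index.add("img/003.jpg", "Animal|Cat")
print(index.files_in("Animal|Bird"))  # ['img/001.jpg', 'img/002.jpg']
```

Because every entry carries the full path, browsing a node and everything below it is a simple prefix match.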

I can see one strong limitation: this IPTC field is usually limited to 32 or 64 characters... however, nothing prevents us from using a higher limit (128, 256...), which would meet the requirements for this solution.

What is your opinion ?

Olivier
PS: This discussion is not in competition with the current ones about categorizing and m3u, but rather as a complement, as those would deal with the internal database issues...
german_beaver
Posts: 5
Joined: Sat Dec 18, 2004 9:27 am

Post by german_beaver »

Wow! It's great that there are also many other people who are interested in implementing categorizing/database features in XnView! Go on, folks! ;)
The idea itself is interesting. The advantage of hierarchical categories in IPTC data would be that they use an existing standard. However, by using overlength data strings in the "category" field we would bypass this standard anyway.
Olivier_G wrote:...however, nothing prevents us from using a higher limit (128, 256...), which would meet the requirements for this solution...
Is a file with an overlength category still legal? Can it be imported correctly by other programs?
Besides, I think the XnView team will not agree to implement a database, because the database is the main feature of XnView Deluxe, the shareware version of XnView. :(

Post by Olivier_G »

(All this stuff is not up to date anymore, and can simply be skipped)
german_beaver wrote:However, by using overlength data strings in the "category" field we would bypass this standard anyway.
Is a file with an overlength category still legal? Can it be imported correctly by other programs?

IPTC-IIMv4.1 (July 1999) says that supplemental categories should be limited to 32 characters max, and moreover that this field is deprecated and will probably not be included in future versions. However, since 1999 more and more programs have adopted IPTC and make use of this particular field. An interesting example is that Adobe uses it in the latest versions of its own metadata set, and we can expect some influence on the next IPTC version (cf. use of XMP). In practice, the maximum length depends on how each developer chooses to implement it in their own software (e.g. XnView = 64... :) ).
Bottom line: this part of the standard is already touchy, and the next version of IPTC should clarify things. Meanwhile there is an interesting opportunity to use this field and to relax the standard as well, as long as it doesn't interfere with other programs...

...and there, the situation is a mixed bag:
good
A. some programs will just manage long fields as well
B. some programs will be able to show the long categories + edit/add short ones only (ex: BreezeBrowser)
bad
C. some programs will be able to show the long categories, but refuse to save anything as long as any word is too long (ex: Pixvue)
D. some programs may refuse to show anything not within the standard

My guess is that the vast majority of software falls into B and C.
While B does not interfere with the solution, C might be problematic (example: if you send a lot of your pictures with 'overlength IPTC words' to an agency that edits the IPTC with software in group C... they will get a surprise :mrgreen: ). This potential problem could be solved with a function to intelligently restrict the IPTC length for selected images.
It would be interesting to know where the main programs stand (A/B/C/D?)... in order to get an idea of the scale of the problem.
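
One possible policy for such a "restrict the length" function, as a sketch only: drop ancestor levels from the left until the path fits, and cut the leaf name as a last resort. The function name `fit_to_limit` and the UTF-8 assumption are mine; the encoding actually used in a given file may differ.

```python
def fit_to_limit(path, limit=32, sep="|"):
    """Shorten a category path so that it occupies at most `limit` octets.

    Assumption: the text is stored as UTF-8, so length is measured in bytes.
    """
    parts = path.split(sep)
    # Drop ancestors from the left until the remainder fits.
    while len(parts) > 1 and len(sep.join(parts).encode("utf-8")) > limit:
        parts.pop(0)
    shortened = sep.join(parts)
    # Last resort: cut the leaf name itself, without splitting a character.
    return shortened.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


print(fit_to_limit("Animal|Bird|Sparrow"))            # Animal|Bird|Sparrow
print(fit_to_limit("Animal|Bird|Sparrow", limit=12))  # Bird|Sparrow
print(fit_to_limit("Animal|Bird|Sparrow", limit=5))   # Sparr
```

The shortened value loses part of the hierarchy, so this would only make sense on copies sent to third parties, with the full path kept in the internal database.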

Side-Note: If Pierre prefers to follow strictly IPTC standards (I can understand that as well), he would then need to change the current 64 max to 32...
german_beaver wrote:Besides, I think the XnView team will not agree to implement a database, because the database is the main feature of XnView Deluxe, the shareware version of XnView. :(
Right, but more and more image programs use categories/albums, and organizing pictures could become a de facto minimum requirement when choosing software => it may be preferable to include categories in the free version, and add more elaborate features in the Deluxe version...
Just my 2 cents, though, as this is clearly Pierre's business (both literally and figuratively :wink: )

Olivier
Last edited by Olivier_G on Mon Dec 26, 2005 11:26 am, edited 3 times in total.

Don't relax... be imaginative!

Post by Olivier_G »

(All this stuff is not up to date anymore, and can simply be skipped)

Thinking again about this...
Although I much prefer the 'hierarchical categories complete path' proposal (simple, meaningful, inclusive and able to simulate the hierarchy with the usual flat system of IPTC), the 32-character limit of the standard and the compatibility problem with some current software are a major issue :(

OK, then... let's get imaginative with those 32 characters. :mrgreen:
IPTC-IIMv4.1 says: "Repeatable, maximum 32 octets, consisting of graphic characters plus spaces."
I have just tested that the system font (things like $&½...) works fine with XnView, Pixvue and IrfanView, so we should have 256 possibilities per character => we could use codes for categories, and include the category code<->name correspondence.

Here is the most efficient option I could come up with:
- 3 chars code for each category (unique)
- the complete hierarchy path is included
- the name correspondence of each category of the hierarchy is included
Back to my example:
- Animal [AAA]
- - Cat [AAB]
- - Bird [AAC]
- - - Eagle [AAD]
- - - Sparrow [AAE]
- - Dog [AAF]
- - - Boxer [AAG]
- - - Bulldog [AAH]
To use the category "Sparrow", I would put in the IPTC fields:
- "|AAAAACAAE" for the complete path.
- "|Animal=AAA"
- "|Bird=AAC"
- "|Sparrow=AAE"
=> This would give 16M possible categories in total, with a max depth of 10 levels. Names would be limited to 27 characters. No limit on association category<=>file.
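
A small sketch of this coding scheme, to check that it round-trips. The table `CATEGORIES` and the functions `encode`/`decode` are hypothetical names, and the sketch assumes single-byte characters and codes that never contain "=" or "|".

```python
FIELD_LIMIT = 32  # octets per entry, as quoted from IIM v4.1 above

# Hypothetical code table for the example: name -> (3-char code, parent name)
CATEGORIES = {
    "Animal": ("AAA", None),
    "Cat": ("AAB", "Animal"),
    "Bird": ("AAC", "Animal"),
    "Eagle": ("AAD", "Bird"),
    "Sparrow": ("AAE", "Bird"),
}


def encode(name, categories=CATEGORIES):
    """Return the IPTC entries for one category: path entry, then name=code entries."""
    chain = []
    while name is not None:
        code, parent = categories[name]
        chain.append((name, code))
        name = parent
    chain.reverse()  # root first
    entries = ["|" + "".join(code for _, code in chain)]
    entries += [f"|{label}={code}" for label, code in chain]
    for entry in entries:
        # Assumption: one octet per character (Latin-1).
        if len(entry.encode("latin-1")) > FIELD_LIMIT:
            raise ValueError(f"entry exceeds {FIELD_LIMIT} octets: {entry!r}")
    return entries


def decode(entries):
    """Rebuild the readable path from the entries written by encode()."""
    names = {}
    path_codes = ""
    for entry in entries:
        body = entry[1:]  # strip the leading '|'
        if "=" in body:
            label, code = body.rsplit("=", 1)
            names[code] = label
        else:
            path_codes = body
    codes = [path_codes[i:i + 3] for i in range(0, len(path_codes), 3)]
    return "|".join(names[code] for code in codes)


entries = encode("Sparrow")
print(entries)          # ['|AAAAACAAE', '|Animal=AAA', '|Bird=AAC', '|Sparrow=AAE']
print(decode(entries))  # Animal|Bird|Sparrow
```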

I also came up with a more complex option, based on 2 chars code restarting at each node + use of prefix & level to identify the category within the complete hierarchical path, as follows:
- AA
- - AA
- - AB
- - - AA
- - - AB (sparrow)
which would give: "|AAAABAB" (first A to identify the hierarchical path within IPTC), "|Animal=AA" (path 'A', first level), "|Bird=AB" (path 'A', second level) and "|Sparrow=AC" (path 'A', third level). Unfortunately, this option with a lot of possible categories (65k categories, and 65k sub-categories for EACH category, and so on... up to 15 levels) would probably require more computing power during replications between files' IPTC and internal DB...
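
The figures quoted for the two options can be rechecked with a few lines of arithmetic. This is only a sanity check under the assumption made above of 256 usable values per character; since the standard speaks of "graphic characters plus spaces", the real count is likely smaller. The helper `limits` is a made-up name.

```python
FIELD_LIMIT = 32  # octets per IPTC entry
SYMBOLS = 256     # assumed distinct values per octet (an upper bound)


def limits(code_len, path_overhead):
    """Return (number of distinct codes, maximum depth of a path entry)."""
    codes = SYMBOLS ** code_len
    max_depth = (FIELD_LIMIT - path_overhead) // code_len
    return codes, max_depth


# Option 1: unique 3-char codes, path entry = "|" + codes
print(limits(3, 1))                           # (16777216, 10)
# Longest name in a "|name=XXX" entry: 32 - "|" - "=" - 3 code chars
print(FIELD_LIMIT - len("|") - len("=") - 3)  # 27

# Option 2: 2-char codes per node, path entry = "|" + path letter + codes
print(limits(2, 2))                           # (65536, 15)
```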


Any comment from the XnView team ?

Olivier
Last edited by Olivier_G on Mon Dec 26, 2005 11:28 am, edited 1 time in total.
xnview
Author of XnView
Posts: 46236
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Post by xnview »

Olivier_G wrote:
german_beaver wrote:Besides, I think the XnView team will not agree to implement a database, because the database is the main feature of XnView Deluxe, the shareware version of XnView. :(
Right, but more and more image programs use categories/albums, and organizing pictures could become a de facto minimum requirement when choosing software => it may be preferable to include categories in the free version, and add more elaborate features in the Deluxe version...
Just my 2 cents, though, as this is clearly Pierre's business (both literally and figuratively :wink: )
Yes, currently it's a feature of XnView Deluxe, but I think XnView will also get some DB or categories feature.
But I don't understand why you want to have the complete hierarchical path in the IPTC field if categories are stored in an external DB...
Pierre.

Post by Olivier_G »

xnview wrote:But I don't understand why you want to have the complete hierarchical path in the IPTC field if categories are stored in an external DB...
Your hierarchical organization can 'travel' with the file without any additional operation. And at the destination:
- Software using the usual flat IPTC system would be able to list all the categories the image belongs to => data has been preserved.
- Software that imports IPTC data into its own DB would be able to re-arrange all files correctly into their right categories (flat but inclusive at first, in hierarchical categories with a bit of work, which could be automated with a script for iMatch, iView, etc...)
- Software that uses exactly the same system would immediately and automatically replicate the right hierarchical organization.
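
For the last two cases, rebuilding the hierarchy from the paths found in the files is straightforward. A minimal sketch, assuming the "|" separator from the first proposal (`build_tree` is a hypothetical helper, not an existing import function of any of the programs mentioned):

```python
def build_tree(paths, sep="|"):
    """Rebuild a nested dict {category: {subcategory: {...}}} from full paths."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.split(sep):
            # Create the level if it does not exist yet, then descend into it.
            node = node.setdefault(part, {})
    return tree


paths_found_in_files = [
    "Animal|Bird|Sparrow",
    "Animal|Bird|Eagle",
    "Animal|Cat",
]
print(build_tree(paths_found_in_files))
# {'Animal': {'Bird': {'Sparrow': {}, 'Eagle': {}}, 'Cat': {}}}
```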

Advantages:
=> For exchangeability, you get the best solution, as you can get the most from the destination software, whatever its capabilities, with a simple process. This can be important if you work with agencies or third parties. Ideally, this system could be implemented in other programs and become a kind of exchange protocol based on the IPTC standard (and anyone could create converters from/to proprietary DBs, exchange file formats, etc...).

=> For longevity, users would be on the safe side, as they would know that all data and the organizational structure are stored in the IPTC standard and can be re-used by other software (cf. exchangeability), and this is a major issue for advanced/pro photographers...

=> From a backup point of view: people are actually concerned for their images, but not that much for the existing proprietary databases. Therefore, by storing the data+organization in the images, it would increase the practical efficiency of backup.

And perhaps the most important point: there is no drawback (as it would be a completely new and complementing solution to existing systems)...
...except that it needs to be implemented and coded... :mrgreen:


Although the organization of your files can appear at first to be a trivial matter, it could become quite a challenging task to recreate it after several years. And for advanced/pro photographers with hundreds of categories and tens (or hundreds) of thousands of pictures, it is critical.

Olivier

An example

Post by Olivier_G »

I just thought of an interesting example to illustrate the advantage of such a system when adopted as an exchange protocol...

Let's say that I am a working photographer dealing with Stock Agencies. I already have my own categories system.
The Agency "A" has built up its own categories as well, and now asks all the photographers to use the same categories in order to populate directly the Stock.
Solution:
We all use our name as top category ("O.G." for me, "Agency A" for the Agency, etc...) and photographers import the hierarchy branch of the Agency. I just organize my pictures as usual, and take into account the new categories from the Agency. When I send pictures to the Agency, they will appear directly in the correct categories of their organization system.
=> Fast setup, and not a single action needed then... 8)
(I can also use the system to know to which Agencies my photos are dispatched)
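
With the owner's name as top category, picking out one party's branch is a prefix test on the path. A rough sketch; the category names below the top level are invented for the example, and `paths_under`/`top_levels` are hypothetical helpers.

```python
def paths_under(paths, top_category, sep="|"):
    """Keep only the paths that belong to the given top-level category."""
    prefix = top_category + sep
    return [p for p in paths if p == top_category or p.startswith(prefix)]


def top_levels(paths, sep="|"):
    """List the distinct top-level categories, in first-seen order."""
    seen = []
    for p in paths:
        top = p.split(sep)[0]
        if top not in seen:
            seen.append(top)
    return seen


# Categories stored in one photo (names below the top level are made up).
photo_categories = [
    "O.G.|Animal|Bird|Eagle",
    "Agency A|Wildlife|Birds of prey",
    "Agency B|Nature",
]

print(paths_under(photo_categories, "Agency A"))  # ['Agency A|Wildlife|Birds of prey']
# Which agencies has this photo been dispatched to?
print([t for t in top_levels(photo_categories) if t != "O.G."])  # ['Agency A', 'Agency B']
```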

The logic is the same for any project involving many persons.
In a family, for example, it will be quite easy to organize photos by family member... and keep the organization when one member becomes independent (own computer...). Useful between different computers (replications, selective backups, etc...) as well as between different programs.

I have to say that I just find all this pretty exciting... :D

Olivier

Update

Post by Olivier_G »

In order to discuss this idea further and get some feedback about the next IPTC release, I joined this discussion group: http://groups.yahoo.com/group/controlledvocabulary

Very interesting:
- The next version, "IPTC Core v1.0" based on XMP, should be published very soon with specifications (January... ?).
- The 'Category' and 'Supplemental Categories' fields will be removed.
- However, compatibility with the Legacy IPTC (IIMv4.1) will be maintained.

The "IPTC Core" may allow some Custom fields... maybe without any limit on the number of characters. That would be the perfect way to implement the first solution (ie: complete hierarchical path), which is the best option so far. :D

Pierre: what do you think of all this ? Could it be interesting for XnView (free/Deluxe) ?

Olivier
PS: You may be interested in joining the discussion group above, as well as this one (http://groups.yahoo.com/group/iptc4xmp/)

Re: Update

Post by xnview »

Olivier_G wrote:In order to discuss further this idea and get some feedback about the next IPTC release, I joined this Discussion group: http://groups.yahoo.com/gro
Pierre: what do you think of all this ? Could it be interesting for XnView (free/Deluxe) ?
OK, I'll check this group.
Pierre.

Post by Olivier_G »

I believe that 'hierarchical categories' is still a hot topic in Image Management, therefore I will continue this thread.

IPTC Core 1.0 was published 9 months ago and has since gained support (Photoshop, Pixvue, iView, etc...). It has also helped XMP to spread.

Now...
- IPTC-IIM(legacy) is obviously not the way to go (you can forget my proposed "|AAA.etc..." scheme :D ).
- Even 'IPTC Core' should probably be avoided as end-users' own specific organization needs are way beyond the IPTC mission statement.
So what is left? ...XMP!

=> 'Hierarchical Categories' could be implemented as a simple XMP item and freely be used/shared as described before.
(and some import/export could be used with IPTC categories/keywords if necessary)
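
To give an idea of what such an XMP item might look like, here is a sketch that stores the paths as an RDF bag using Python's standard `xml.etree.ElementTree`. The `hc` namespace and the `Categories` property are invented for this example (no such schema is defined anywhere), and a real XMP packet would also need the usual `x:xmpmeta` / `xpacket` wrapper around this RDF core.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
HC_NS = "http://example.com/ns/hiercat/1.0/"  # hypothetical custom namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("hc", HC_NS)


def categories_to_xmp(paths):
    """Serialise category paths as an unordered rdf:Bag inside rdf:RDF."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": ""}
    )
    prop = ET.SubElement(desc, f"{{{HC_NS}}}Categories")
    bag = ET.SubElement(prop, f"{{{RDF_NS}}}Bag")
    for path in paths:
        ET.SubElement(bag, f"{{{RDF_NS}}}li").text = path
    return ET.tostring(rdf, encoding="unicode")


def xmp_to_categories(xml_text):
    """Read the category paths back from the RDF produced above."""
    root = ET.fromstring(xml_text)
    return [li.text for li in root.iter(f"{{{RDF_NS}}}li")]


xml_text = categories_to_xmp(["Landscape|Sunset", "Animal|Bird|Eagle"])
print(xml_text)
print(xmp_to_categories(xml_text))  # ['Landscape|Sunset', 'Animal|Bird|Eagle']
```

Since each list item is free text, there is no 32-character limit to work around here.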

It would be very interesting to implement it in XnView (along with a new Image Management interface).
Is anyone interested in this idea? Or has any experience with XMP implementation?

Olivier

Post by Olivier_G »

This thread goes along with "Thoughts about Categories Management", but deals with all the 'behind the scenes' stuff and how to implement it.

More details:
- XnView would use an internal DataBase (files+categories) for instant response.
- XMPs would then be processed in the background. An internal list would keep track of progress, so that the user can postpone/resume the operation (i.e. close XnView, etc...) and so that it can resume after a crash.
- Exceptions should be managed carefully (e.g. what if files are deleted/moved/renamed/modified by another program? -> ask the user: cancel, change target, save XMP data...)
- There would be two types of elements: 'Category' (related to IPTC keywords/categories) and 'Workflow' (such as Priority, Ratings, Status...)
- You would be able to set how IPTC data is related: 'none'/'specified category alone'/'include all categories above' + 'merge with existing data'/'keep it separate' for Category and Priority+Status for Workflow
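
The "internal list that keeps track of progress" could be as simple as a small journal file rewritten after each processed image. The sketch below is one way to do it, not a description of how XnView works; `PendingWrites` and the `write_xmp` callback are hypothetical.

```python
import json
import os
import tempfile


class PendingWrites:
    """Persistent to-do list of files whose XMP still has to be written."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        if os.path.exists(journal_path):
            # Resume where the previous session (or crash) left off.
            with open(journal_path, encoding="utf-8") as fh:
                self.pending = json.load(fh)
        else:
            self.pending = []

    def _save(self):
        # Write to a temp file, then rename, so the journal is never half-written.
        tmp = self.journal_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(self.pending, fh)
        os.replace(tmp, self.journal_path)

    def enqueue(self, file_path):
        if file_path not in self.pending:
            self.pending.append(file_path)
            self._save()

    def process(self, write_xmp, max_items=None):
        """Call write_xmp(file) per pending file; a file is removed only on success."""
        done = 0
        while self.pending and (max_items is None or done < max_items):
            write_xmp(self.pending[0])  # may raise: the file then stays pending
            self.pending.pop(0)
            self._save()
            done += 1
        return done


# Usage: process one file, "close the program", then resume later.
journal = os.path.join(tempfile.mkdtemp(), "pending.json")
queue = PendingWrites(journal)
queue.enqueue("eagle_01.jpg")
queue.enqueue("eagle_02.jpg")
queue.process(lambda f: print("writing XMP for", f), max_items=1)
print(PendingWrites(journal).pending)  # ['eagle_02.jpg']
```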


An example...
I put several images of an eagle flying above a lake at sunset in the following categories:
- Landscape > Sunset
- Landscape > Lake
- Animal > Bird > Eagle

The internal DB will instantly manage the hierarchy of categories, the categories assigned to the files and their location on the disk.
Then, XnView will write the XMP data for those files in the background (something like "|Landscape|Sunset|;|Landscape|Lake|;|Animal|Bird|Eagle|").
Because I selected 'include all categories above' for IPTC, it will also add the keywords "Landscape";"Sunset";"Lake";"Animal";"Bird";"Eagle" to the IPTC of those files.
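
Both outputs of this example can be derived from the same list of paths. A short sketch, assuming the "|...|;|...|" notation suggested above and the two option names from the list of settings (`to_xmp_value` and `iptc_keywords` are made-up names):

```python
def to_xmp_value(paths):
    """Serialise paths as "|A|B|;|C|D|", the notation suggested above."""
    return ";".join("|" + "|".join(path) + "|" for path in paths)


def iptc_keywords(paths, mode="include all categories above"):
    """Derive flat IPTC keywords from hierarchical paths, without duplicates."""
    keywords = []
    for path in paths:
        if mode == "include all categories above":
            names = path
        else:  # 'specified category alone'
            names = path[-1:]
        for name in names:
            if name not in keywords:
                keywords.append(name)
    return keywords


paths = [
    ["Landscape", "Sunset"],
    ["Landscape", "Lake"],
    ["Animal", "Bird", "Eagle"],
]
print(to_xmp_value(paths))
# |Landscape|Sunset|;|Landscape|Lake|;|Animal|Bird|Eagle|
print(iptc_keywords(paths))
# ['Landscape', 'Sunset', 'Lake', 'Animal', 'Bird', 'Eagle']
print(iptc_keywords(paths, mode="specified category alone"))
# ['Sunset', 'Lake', 'Eagle']
```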

If I move the branch 'Bird>Eagle' to the node 'Landscape>Sunset' (just an example... don't look for a real meaning here), all files in the Bird category or below will be updated:
- in the XMP: change "|Animal|Bird|" to "|Landscape|Sunset|Bird|"
- in the IPTC, based on the settings, remove "Animal" and add "Landscape";"Sunset" (+ avoid duplicates and check the removal against the other categories)
If I then change the IPTC option from 'include all categories above' to 'specified category alone', all files will need an update (here: remove "Landscape"; "Bird", to keep "Sunset"; "Lake"; "Eagle").
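
The branch move and the resulting keyword changes can be sketched as follows, using the eagle example and the 'include all categories above' setting. The helper names are hypothetical; the point is only to show the duplicate check and the removal check in action.

```python
def move_branch(paths, branch, new_parent):
    """Re-root every path starting with `branch` under `new_parent`."""
    moved = []
    for path in paths:
        if path[:len(branch)] == branch:
            # Keep the last element of `branch` and everything below it.
            moved.append(new_parent + path[len(branch) - 1:])
        else:
            moved.append(path)
    return moved


def all_keywords(paths):
    """Flat keywords for 'include all categories above', without duplicates."""
    keywords = []
    for path in paths:
        for name in path:
            if name not in keywords:
                keywords.append(name)
    return keywords


def keyword_changes(before, after):
    """Return (to_remove, to_add) between two keyword lists."""
    to_remove = [k for k in before if k not in after]
    to_add = [k for k in after if k not in before]
    return to_remove, to_add


old_paths = [["Landscape", "Sunset"], ["Landscape", "Lake"], ["Animal", "Bird", "Eagle"]]
new_paths = move_branch(old_paths, ["Animal", "Bird"], ["Landscape", "Sunset"])
print(new_paths[2])  # ['Landscape', 'Sunset', 'Bird', 'Eagle']
print(keyword_changes(all_keywords(old_paths), all_keywords(new_paths)))
# (['Animal'], [])
```

In this particular file nothing has to be added, because "Landscape" and "Sunset" are already present through the other two categories; only "Animal" goes away.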

Existing IPTC keywords have to be handled carefully, especially when importing them into the hierarchical XMP model (propose matching hierarchical categories, create a new category, rules for selections, etc...). This is probably the most complex part... (and is not necessary for the first step, XMP 'Hierarchical Categories').

This being said: finding and implementing the proper solution may be complex... but using it should be easy.

Olivier
Xyzzy
Posts: 652
Joined: Tue Nov 23, 2004 10:17 pm
Location: Poland

Post by Xyzzy »

Frankly speaking, I'd like XnView to stay an image viewer with some added image editing features.

If you want a powerful and flexible image management program, I'd suggest iMatch, which I personally use. Look into it, really.

IMO it is better to polish and add image viewing/manipulation features than to bloat XnView with a database engine.

X.

Post by Olivier_G »

Xyzzy wrote:Frankly speaking, I'd like XnView to stay image viewer with some added image editing features.
Should we remove from XnView: Browser, Thumbnails DB, Batch Converting/Renaming, IPTC, SlideShow, Webpage, Contactsheet, Filters, etc... ?
If you want powerful and flexible image management program, I'd suggest iMatch I personally use. Look into it, really.
Did you notice how iMatch is bloated with Viewer, Browser, Editing, SlideShow, Webpage, ContactSheet, etc... ? :D

It may actually be possible to add a good Image Management to XnView with a 35KB increase... instead of having to get that 35MB download monster!!!
For the first part, only a small internal DB for handling categories would be needed, along a tab in the TreeView panel to show categories and handle drag'n drop. That is quite a moderate addition (handling IPTC and XMP should come later).
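
To show how small such a DB could be, here is a sketch with two tables using Python's built-in `sqlite3`. The schema and function names are my own assumptions for illustration; nothing here reflects XnView's actual storage.

```python
import sqlite3

SCHEMA = """
CREATE TABLE category (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES category(id),
    UNIQUE (parent_id, name)
);
CREATE TABLE file_category (
    file_path   TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES category(id),
    PRIMARY KEY (file_path, category_id)
);
"""


def open_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_category(db, name, parent_id=None):
    cur = db.execute(
        "INSERT INTO category (name, parent_id) VALUES (?, ?)", (name, parent_id)
    )
    return cur.lastrowid


def move_category(db, category_id, new_parent_id):
    # Drag and drop of a branch is one row update; children follow via parent_id.
    db.execute(
        "UPDATE category SET parent_id = ? WHERE id = ?", (new_parent_id, category_id)
    )


def full_path(db, category_id):
    """Walk up the parent links and return e.g. 'Animal|Bird|Eagle'."""
    parts = []
    while category_id is not None:
        name, category_id = db.execute(
            "SELECT name, parent_id FROM category WHERE id = ?", (category_id,)
        ).fetchone()
        parts.append(name)
    return "|".join(reversed(parts))


db = open_db()
animal = add_category(db, "Animal")
bird = add_category(db, "Bird", animal)
eagle = add_category(db, "Eagle", bird)
landscape = add_category(db, "Landscape")
print(full_path(db, eagle))  # Animal|Bird|Eagle
move_category(db, bird, landscape)
print(full_path(db, eagle))  # Landscape|Bird|Eagle
```

Files link to a category id rather than to a path, so moving a branch in the tree does not touch the file table at all.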

I really believe Image Management doesn't need to be a huge separate thing... and that it (with some good design thinking) can be light and easy to use. Something that hasn't been made so far, yet...

Olivier

Post by Xyzzy »

Olivier_G wrote:
Xyzzy wrote:Frankly speaking, I'd like XnView to stay an image viewer with some added image editing features.
Should we remove from XnView: Browser, Thumbnails DB, Batch Converting/Renaming, IPTC, SlideShow, Webpage, Contactsheet, Filters, etc... ?
They are strictly related to viewing/presenting images, aren't they? You should rather ask why image editing features shouldn't go. For me, all filters and adjustments can.
Olivier_G wrote:Did you notice how iMatch is bloated with Viewer, Browser, Editing, SlideShow, Webpage, ContactSheet, etc... ? :D
An image management app is supposed to have image viewing/presentation capabilities and NOT vice versa.
Olivier_G wrote: It may actually be possible to add a good Image Management to XnView with a 35KB increase... instead of having to get that 35MB download monster!!!
iMatch is a programmable database engine suited for image BLOBs. It is not supposed to be the default Windows image viewer.
Olivier_G wrote: For the first part, only a small internal DB for handling categories would be needed, along a tab in the TreeView panel to show categories and handle drag'n drop. That is quite a moderate addition (handling IPTC and XMP should come later).
I really believe Image Management doesn't need to be a huge separate thing... and that it (with some good design thinking) can be light and easy to use. Something that hasn't been made so far, yet...
Some of these features were reserved for Deluxe not so long ago, that's one thing. The second one: do you HOPE that this is possible or do you KNOW it? E.g. do you have any concept of a speedy database that is light on resources? If we want a good feature instead of a very basic one (with everyone complaining about the lack of options), we should consider editable XML schemas for the database, removable drive support, etc, etc.

I'd rather see the high quality rendering engine rewritten for speed.

And let me say again: I do NOT want image management in XnView, especially not a mediocre one that bloats the application.

X.

Post by Olivier_G »

Xyzzy wrote:And let me say again: I do NOT want image management in XnView, especially not a mediocre one that bloats the application.
I think I've got the message... :mrgreen:


Then... let's look how to get that good and fast/light Image Management into our beloved XnView.
(See the difference? I don't start by thinking it's impossible... I rather say "Ok... Who is interested? Then... let's see what we can get")

Olivier