DB Thumbnails

Ideas for improvements and requests for new features in XnView MP

xnview
Author of XnView
Posts: 31918
Joined: Mon Oct 13, 2003 7:31 am
Location: France

DB Thumbnails

Post by xnview » Wed Jun 19, 2013 1:00 pm

Currently, when you change the thumbnail size, XnViewMP recomputes all thumbnails.

What do you think about XnViewMP not recomputing the thumbnails and using the currently stored thumbnail instead? XnViewMP would rebuild the full list only when you use 'rebuild thumbnails'.
Pierre.

oops66
XnThusiast
Posts: 1999
Joined: Tue Jul 17, 2007 1:17 am
Location: France

Re: DB Thumbnails

Post by oops66 » Wed Jun 19, 2013 1:07 pm

... I agree
XnViewMP 0.82 Linux X64 - Ubuntu 16.04 LTS - X64

JohnFredC
XnThusiast
Posts: 2010
Joined: Wed Mar 17, 2004 8:33 pm
Location: Sarasota Florida

Re: DB Thumbnails

Post by JohnFredC » Wed Jun 19, 2013 5:08 pm

Yes, I also agree.

In the forum's past, there have been many requests for an "on the fly" resizing of the thumbs that did not rebuild the thumb image stored in the db (Mr. Librarian?). Suggestions have included both 1) simply rescaling the thumbs and/or 2) storing more sizes in the db.

The problem with resizing w/o resampling is aliasing of the larger thumbs. Reducing the thumbs size does not present as much of a problem.

However, storing very large thumbs in the db is problematic for db file size and access. The same issue (db size and speed of access) arises when storing multiple thumbs in the db for each image.

Some ideas:
  1. Always create and store (in the db) a thumb size somewhat larger than the requested size. This will help with the aliasing.
  2. In thumb options, ask the user what size thumb (Large, Medium, Small) to store by default in the db. This will help with the db size.
  3. After the user has resized the thumbs (via the slider), ask the user whether to "Save new thumb size for this folder only? All folders?" This will prevent resampling when the user doesn't want it.
  4. If the user has requested resampling "for this folder only" then preserve the value and re-use it for new images in that folder (only).
The important thing is not to automatically invalidate every thumb in the db when the user slides the thumb size control in one folder.

PS.

A work-around (the one I employ) for the current versions of XnView is to keep two or more dbs and switch between them via the INI command-line argument. For instance I customarily keep two XnViews open "Side-By-Side", each pointing to a different INI file and different db files, one with small thumbs and one with large thumbs. The "large" thumbs db only keeps thumbs for a few important folders, so db size doesn't become a problem.

PPS.

Another relevant topic here.
John


m.Th.
XnThusiast
Posts: 1560
Joined: Wed Aug 16, 2006 6:31 am

Re: DB Thumbnails

Post by m.Th. » Thu Jun 20, 2013 8:40 am

xnview wrote:Currently when you change thumbnail size, XnViewMP will recompute all thumbnails

What do you think if XnViewMP will not recompute thumbnails, and use current stored thumbnail. And only if you use 'rebuild thumbnails', so XnViewMP will rebuild full list?
Yes, generally. But with some improvements.

1. Add WebP Lossy High Quality compression (let's say quality 80) for the thumbnails in the Cache DB and set it as the default. Besides that, also add a WebP Lossy Medium-Low Quality option in Settings | Browser | Cache DB. WebP has many advantages over JPEG (see here: https://developers.google.com/speed/webp/). Also (not noted there), WebP has explicit support for parallel decoding (i.e. faster rendering) and uses fixed-point maths (faster encoding, i.e. faster thumbnail generation). I did some tests: a 300x200 High Quality WebP thumbnail (quality: 80) averages 10 kB, while the JPEG (quality: 85) clocks in at 14 kB.

2. When a stored thumbnail is bigger than the requested size, do not recompute it automatically. Recompute it only if it is, let's say, 400-500% bigger than the requested size.

3. When the stored thumbnail is smaller than the requested size but still above a certain threshold percentage of it (let's say 50% in my opinion, or 33%? other opinions?), do not recompute it. Resample it on the fly with a fast and good kernel and display it (remember that only a few thumbnails will be rendered, hence it is neat to have a good kernel in order to minimize the rebuilds). If the thumbnail is smaller than that threshold, rebuild it (i.e. if it is smaller than 50% of the requested size, aliasing artifacts will appear, so recompute it).
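
The rule in points 2 and 3 can be sketched as a small decision function. This is only an illustration: the names `Action` and `choose_action` are hypothetical, and the two thresholds are the "let's say" figures from the post (400% read as a stored/requested ratio of 4.0, and 50% as a ratio of 0.5), not values taken from XnViewMP.

```python
from enum import Enum


class Action(Enum):
    USE_AS_IS = "scale the stored thumbnail down on the fly"
    UPSCALE = "resample the stored thumbnail up on the fly"
    REBUILD = "recompute the thumbnail from the source image"


# Assumed thresholds, taken from the "let's say" figures in the post.
MAX_OVERSIZE_RATIO = 4.0   # stored thumb more than 400% of the request
MIN_UNDERSIZE_RATIO = 0.5  # stored thumb under 50% of the request


def choose_action(stored_px: int, requested_px: int) -> Action:
    """Decide what to do with a stored thumbnail of stored_px (longest side)
    when the browser asks for requested_px."""
    if stored_px <= 0 or requested_px <= 0:
        raise ValueError("sizes must be positive")
    ratio = stored_px / requested_px
    if ratio > MAX_OVERSIZE_RATIO or ratio < MIN_UNDERSIZE_RATIO:
        return Action.REBUILD
    if ratio >= 1.0:
        return Action.USE_AS_IS
    return Action.UPSCALE


if __name__ == "__main__":
    for stored, requested in [(200, 200), (150, 200), (90, 200), (1000, 200)]:
        print(stored, requested, choose_action(stored, requested).name)
```

With these assumed thresholds, sliding the size control between 50% and 400% of the stored size never touches the db.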

Also, be aware that most programs have at least TWO (or more) databases: one for internal (meta)data/searching and one for thumbs. There are various reasons for that; the main and most straightforward ones are:

1. (At least) two threads can access them. AfterShotPro has an option to specify how many threads to allocate for thumb writing.
2. They can reside on separate HDDs/SSDs. For example, Oracle uses this technique for huge performance gains. In fact this is normal: especially on an HDD, the seek time increases significantly if two (or more) threads access the very same disk. OK, I know, Oracle isn't a DAM, but we're discussing DB architecture here.
3. These databases can have different "database" engines. For example, Lightroom stores the index/searching metadata in a SQLite db (a behemoth, believe me) and the thumbs directly on disk, each thumb in a .lrprev file. IDImager uses two databases: the search db can be a networked db (namely Microsoft SQL Server), hence it gives multi-user access "for free". The thumbs are stored in a SQLite db which can be shared. OK, SQLite doesn't support multi-user access, but many users can open it and see it as it was at the moment it was opened. A very small limitation, if you ask me.

Hence, besides the above improvements about resizing, I'd vote for the database split. Any other opinions?
Last edited by m.Th. on Sat Jun 29, 2013 8:08 am, edited 1 time in total.
m. Th.

The Ascetic Experience - The best photos and texts from Holy Mountain (Athos)

- Dark Themed XnViewMP 0.90 64bit & XnView 2.00 x64 on Win7 x64 -

oops66
XnThusiast
Posts: 1999
Joined: Tue Jul 17, 2007 1:17 am
Location: France

Re: DB Thumbnails

Post by oops66 » Thu Jun 20, 2013 9:37 am

m.Th. wrote:...Hence, besides the above improvements about resizing, I'd vote for the database split. Any other opinions?
Hello,
1. If it's better and smaller to store, WebP is a good idea.

2. I agree with the ability to move (as an option) the 2 (or more) db files (for ex.: xnview-thumbnail.xndb & Xnview-meta.xndb) to another HDD/SSD or a fast USB3 key.
... So, in this case, maybe with a "base path" to configure first (and relative paths stored in the DBs?) for better portability?

3. Yes, automatically recompute only if the thumbnail quality is not good enough (and maybe add a custom user option for the default "thumbnail size to store in DB").

So, good idea; I also vote for the database split, and movable too ... (so more threads can be used, both or only one db can be moved to an external disk/USB key, etc...)
... But maybe the load is too high for Pierre (with his other projects)?

xnview
Author of XnView
Posts: 31918
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: DB Thumbnails

Post by xnview » Tue Jun 25, 2013 3:58 pm

m.Th. wrote: Hence, besides the above improvements about resizing, I'd vote for the database split. Any other opinions?
So one database for all file entries, and one for thumbnails. What's the benefit? 2 threads can already access the thumbnail table...
If thumbnails are in another database, the current database will be smaller, perhaps good???
Pierre.

oops66
XnThusiast
Posts: 1999
Joined: Tue Jul 17, 2007 1:17 am
Location: France

Re: DB Thumbnails

Post by oops66 » Tue Jun 25, 2013 4:11 pm

xnview wrote:
m.Th. wrote: Hence, besides the above improvements about resizing, I'd vote for the database split. Any other opinions?
So one database for all file entries, and one for thumbnails. What's the benefit? 2 threads can already access the thumbnail table...
If thumbnails are in another database, the current database will be smaller, perhaps good???
Hello,
... I think one of the main benefits of splitting these two kinds of DB data is:
- One is "generated automatically" (the thumbnails process, the DB with the "low added value") and is usually big
- The other one is filled in by human hand and "little fingers" ;-) (keywords, categories, ratings, color labels ...); it is the DB with the "high added value" and is generally smaller (so easier to manage, to back up, etc...)

... And maybe it would be interesting, in this case, to store not the full paths inside these 2 DBs but relative paths, with a "base path" to configure first (and paths relative to this "base path" stored in the DBs) ... for better portability? (.. then only the "base path" has to change to view photos from a backup (images on a DVD), an external HDD ...)

m.Th.
XnThusiast
Posts: 1560
Joined: Wed Aug 16, 2006 6:31 am

Re: DB Thumbnails

Post by m.Th. » Wed Jun 26, 2013 7:05 am

oops66 wrote:
xnview wrote:
m.Th. wrote: Hence, besides the above improvements about resizing, I'd vote for the database split. Any other opinions?
So one database for all file entries, and one for thumbnails. What's the benefit? 2 threads can already access the thumbnail table...
If thumbnails are in another database, the current database will be smaller, perhaps good???
Hello,
... I think one of the main benefits of splitting these two kinds of DB data is:
- One is "generated automatically" (the thumbnails process, the DB with the "low added value") and is usually big
- The other one is filled in by human hand and "little fingers" ;-) (keywords, categories, ratings, color labels ...); it is the DB with the "high added value" and is generally smaller (so easier to manage, to back up, etc...)
Of course! :) And not only that. The "little fingers DB" ;-) (aka the Catalog) must be very fast for searches. And right now it is not. And (unfortunately) it isn't your fault. Why? Let's write another small chapter of the "Free Crash DB Course for Pierre": :)

As you know, DB files are organized in pages. In a classical DB engine (NOT SQLite) there are INTEGER, TEXT (ok, VARCHAR), FLOAT... and a bunch of other types with a fixed length. And there are BLOBs (BLOB, CLOB, MEMO, etc.): variable-length types which are usually BIG, much bigger than the fixed-length types. Hence, the DB engine keeps all the fixed-length types in so-called Table Pages, and for each BLOB it keeps only a pointer to the first BLOB Page, into which the entire content of the BLOB is dumped (and if the BLOB doesn't fit in a single BLOB Page, that page holds a pointer to the next BLOB Page, and so on). As you see, in a normal DB server, by reading just one Table Page the engine is able to access and process a bunch of rows (hundreds or even thousands) because the BLOBs are kept out of it. That's also why some DB engines don't allow sorting or even searching on BLOBs: these are just "binary" data (yes, there are some MEMOs or long VARCHARs which are searchable and sortable, but let us not complicate things).

Unfortunately, in SQLite things aren't like this: the BLOBs are stored inline. :-( ...hence, even if they aren't accessed, they degrade performance by filling the Table Page with their data. So, instead of having 1000s of records in a 16k page, we'll have only one (!) or two (!).

See here:
http://stackoverflow.com/questions/1725 ... erformance

That's why in SQLite the BLOBs should be the last fields (the engine reads the fields from left to right), and that's why they're pushed out of our "fast" tables or, even better, out of our "fast" db.
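
A minimal sketch of that split, using Python's standard `sqlite3` module. The file names (`catalog.db`, `thumbs.db`), the table and column names and the helper functions are all assumptions made for illustration; they are not XnViewMP's actual schema.

```python
import os
import sqlite3
import tempfile


def open_split_dbs(folder):
    """Open (and create if needed) a small catalog db and a separate thumbs db."""
    catalog = sqlite3.connect(os.path.join(folder, "catalog.db"))
    thumbs = sqlite3.connect(os.path.join(folder, "thumbs.db"))
    # Catalog rows carry no BLOB column, so they stay small.
    catalog.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " file_id INTEGER PRIMARY KEY, path TEXT NOT NULL UNIQUE, rating INTEGER)"
    )
    # The bulky thumbnail data lives in its own file, keyed by the same id.
    thumbs.execute(
        "CREATE TABLE IF NOT EXISTS thumbnails ("
        " file_id INTEGER PRIMARY KEY, width INTEGER, height INTEGER, data BLOB)"
    )
    return catalog, thumbs


def add_file(catalog, thumbs, path, rating, thumb_bytes, width, height):
    """Insert the catalog row first, then the thumbnail under the same id."""
    cur = catalog.execute(
        "INSERT INTO files (path, rating) VALUES (?, ?)", (path, rating)
    )
    file_id = cur.lastrowid
    thumbs.execute(
        "INSERT INTO thumbnails (file_id, width, height, data) VALUES (?, ?, ?, ?)",
        (file_id, width, height, thumb_bytes),
    )
    catalog.commit()
    thumbs.commit()
    return file_id


def find_by_rating(catalog, min_rating):
    """Search touches only the catalog db; no thumbnail bytes are read."""
    return catalog.execute(
        "SELECT file_id, path FROM files WHERE rating >= ? ORDER BY path",
        (min_rating,),
    ).fetchall()


def load_thumb(thumbs, file_id):
    row = thumbs.execute(
        "SELECT data FROM thumbnails WHERE file_id = ?", (file_id,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        catalog, thumbs = open_split_dbs(folder)
        fid = add_file(catalog, thumbs, "D:/Archive/a.jpg", 5, b"fake-bytes", 300, 200)
        print(find_by_rating(catalog, 4), len(load_thumb(thumbs, fid)))
        catalog.close()
        thumbs.close()
```

Because the two files are independent, the thumbs db can be deleted and regenerated without touching the hand-entered catalog data.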

... And maybe it would be interesting, in this case, to store not the full paths inside these 2 DBs but relative paths, with a "base path" to configure first (and paths relative to this "base path" stored in the DBs) ... for better portability? (.. then only the "base path" has to change to view photos from a backup (images on a DVD), an external HDD ...)
Oh, relocation! Yes! Of course! We need also to provide a tool for this in the 'DB Maintenance'.

In fact, a perfect DB architecture should look like this:
[Attachment: XnViewMP DB Layer.jpg]
With both Vertical Partitioning (which we discussed above) AND Horizontal Partitioning.

Let me explain about Horizontal Partitioning: ;)

Theoretically, we can use XnView to browse 'everywhere', and we have photos 'everywhere' on our storage(s) (local and LAN).

But in practice it isn't like that.

Usually, users have one, two (most of the time), three or at most a few root folders in which they store images. Most probably you anticipated that, since your DB Manager (which you put in Tools | Settings | Browser | CacheDB) has an edit box called 'Base path of your pictures'.

However, this isn't enough.

Besides our Archive (one root folder) and the Excluded Folders (folders like C:\Temp\* etc.) there exists a "Gray Zone" of various folders on the same disk which we want to see (view) but not to search / organize (think of images related to other hobbies, secondary activities, temporary items etc.). And this Gray Zone usually has a much bigger database footprint than our Archive. Hence we already need two databases: one with high performance, high data quality and high-quality thumbnails but relatively few items (our "Archive", limited to one root folder, say D:\Archive\) and another one for the 'rest of the world'. The latter is in fact the db which you have now; it will, of course, be mandatory and will have the lowest priority in the dispatcher's db list.

What is the db dispatcher?

For this to work you need a dispatcher (OK, a simple text list) which looks at the root path of each db.
If the requested path to display lies under one of the root paths (i.e. that root path is a prefix of it), the program will look in the corresponding db. Otherwise it looks in the 'rest of the world' db.
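
A minimal sketch of such a dispatcher, assuming Windows-style paths and a plain list of (root folder, db name) pairs in priority order. `pick_db`, `REST_OF_WORLD` and the db file names are hypothetical names used only for this example.

```python
from pathlib import PureWindowsPath

REST_OF_WORLD = "rest_of_world.db"  # assumed name of the catch-all db


def pick_db(requested_path, roots):
    """Return the db whose root folder contains requested_path.

    roots is a list of (root_folder, db_name) pairs, checked in order.
    Comparing whole path components (not raw string prefixes) keeps
    D:\\ArchiveOld from matching the root D:\\Archive.
    """
    requested = PureWindowsPath(requested_path)
    for root, db_name in roots:
        root_path = PureWindowsPath(root)
        if requested == root_path or root_path in requested.parents:
            return db_name
    return REST_OF_WORLD


if __name__ == "__main__":
    roots = [(r"D:\Archive", "archive.db"), (r"D:\Inbox", "inbox.db")]
    for path in (r"D:\Archive\2013\a.jpg", r"D:\ArchiveOld\a.jpg", r"C:\Temp\x.png"):
        print(path, "->", pick_db(path, roots))
```

`PureWindowsPath` compares case-insensitively, which matches how Windows treats folder names; on other platforms a different path class would be the natural choice.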

Perhaps you are now thinking about the connection overhead. Besides the fact that SQLite has a very-ultra-super fast connection time (no authentication & authorization, no network protocols etc.), the problem can easily be solved by using a 'connection pool' (see http://en.wikipedia.org/wiki/Connection_pool). Keep two (or more?) databases always open and use a pointer called, let's say, currentDB, which you switch to the correct db only when needed. In fact, a similar concept can be found in thread programming.
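
The "keep them open and switch a pointer" idea could look roughly like this. `DbPool`, `current_db` and `switch_to` are hypothetical names, and in-memory databases stand in for the real files.

```python
import sqlite3


class DbPool:
    """Keeps every database open and only moves a 'current' pointer."""

    def __init__(self, db_files):
        # db_files maps a short name to a db file path; must not be empty.
        if not db_files:
            raise ValueError("at least one database is required")
        self._connections = {
            name: sqlite3.connect(path) for name, path in db_files.items()
        }
        self.current_name = next(iter(self._connections))

    @property
    def current_db(self):
        return self._connections[self.current_name]

    def switch_to(self, name):
        """Point current_db at an already-open connection; nothing is reopened."""
        if name not in self._connections:
            raise KeyError(name)
        self.current_name = name
        return self.current_db

    def close(self):
        for connection in self._connections.values():
            connection.close()


if __name__ == "__main__":
    pool = DbPool({"archive": ":memory:", "rest": ":memory:"})
    pool.current_db.execute("CREATE TABLE files (path TEXT)")
    pool.switch_to("rest")      # no connect() call happens here
    pool.switch_to("archive")   # same connection object as before
    print(pool.current_db.execute("SELECT count(*) FROM files").fetchone())
    pool.close()
```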

I think it is clear. If not, please tell me what you would like me to clarify.

As an aside, quite often there are three (or more) root folders: 'Uncategorized (Photo Inbox)', 'Archive' and 'rest-of-the-world', especially when (small) teams are involved.
m. Th.

xnview
Author of XnView
Posts: 31918
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: DB Thumbnails

Post by xnview » Wed Jun 26, 2013 8:32 am

m.Th. wrote: Of course! :) And not only that. The "little fingers DB" ;-) (aka the Catalog) must be very fast for searches. And right now it is not. And (unfortunately) it isn't your fault. Why? Let's write another small chapter of the "Free Crash DB Course for Pierre": :)
Ok, so all BLOB data must not be in Catalog DB, right? What about Meta BLOB?
I think that first change can be to create a database specific to Thumbnails BLOB
Let me explain about Horizontal Partitioning: ;)
Ok, thanks for explanation :)
But it seems complex: if you want to view all files for a specific category, you must open/check all the databases :?: :shock:
Pierre.

oops66
XnThusiast
Posts: 1999
Joined: Tue Jul 17, 2007 1:17 am
Location: France

Re: DB Thumbnails

Post by oops66 » Wed Jun 26, 2013 9:30 am

Hello,

First, thank you for the course ... by spending some time, nice graphics and energy to help us (xnview forumers)... and to help to improve the XnViewMP DB.

"I think it is clear. If not, pls tell me what do you want to clarify for you."

... Some parts are already clear to me, but others are not ... so it's probably clear, just not in a single pass (for me) ... but the most important thing is that I am (we are) enthusiastic ;-)

...So I will read the part about Horizontal Partitioning again: ;)

Edit: OK ... so for you it's better to have 2 DBs (Catalog DB + thumbs DB) per disk/partition/network drive to match "the root folders list"? ... to build/use relative paths in the db ... (But if the data come from a read-only medium like a DVD, is that not possible?)

m.Th.
XnThusiast
Posts: 1560
Joined: Wed Aug 16, 2006 6:31 am

Re: DB Thumbnails

Post by m.Th. » Wed Jun 26, 2013 10:17 am

xnview wrote:
m.Th. wrote: Of course! :) And not only that. The "little fingers DB" ;-) (aka the Catalog) must be very fast for searches. And right now it is not. And (unfortunately) it isn't your fault. Why? Let's write another small chapter of the "Free Crash DB Course for Pierre": :)
Ok, so all BLOB data must not be in Catalog DB, right? What about Meta BLOB?
I think that first change can be to create a database specific to Thumbnails BLOB
Yes, the first change is to create a db specific for Thumbnails.

2nd change (which can be done in parallel) is refactoring the Meta BLOB. Basically, it should be divided into 3 (three) parts: an EXIF table, an IPTC table and the small remainder of information not already covered by the two aforementioned tables. BEWARE! The EXIF and IPTC tables will not hold all the EXIF (or IPTC) info! Only the most searched and the most efficient fields: the ones which you list in the Search window (btw, in the EXIF info you forgot (among others) Shutter Speed and Exposure Compensation, which are basic info). By 'efficient' I mean that we must generally avoid strings (which take a lot of space), but for the most necessary ones we have a solution.

That's why I've asked this: http://newsgroup.xnview.com/viewtopic.p ... 32#p112185 ...but it seems to be too deep in the thread in order to be noticed.

But for this I need some info from you about the menu under the "Add >>" button in the Search dialog:

1. There are some fields in the IPTC submenu which reappear with the same name in the XMP submenu. Do they represent the same field, or can they hold different info?
2. Is the Annotation field from the main menu the same as another field from the IPTC and/or XMP submenus? Where do you store the data for this field? How can it be entered? Where does it come from?
3. In the main menu and in the submenus there is an entry called "All Fields". Does this mean all fields from the specific menu, or all fields from ALL the EXIF/IPTC (whatever...) info, knowing that there are many more EXIF/IPTC tags (data) than in your menu?
Let me explain about Horizontal Partitioning: ;)
Ok, thanks for explanation :)
But seems complex, if you want to view all files for a specific category, you must open/check in all database :?: :shock:
There are several solutions for that:

- Connection Pools (really easy to implement; it is a list of connections)
- Attach DBs http://www.sqlite.org/lang_attach.html
- Local Map / Reduce
- Query Clustering
etc.

...and it will be faster (especially if it runs in parallel) and isn't so complex. But about this later on.
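
For the "Attach DBs" option, here is a minimal sketch of a category search that spans two database files through one connection. `ATTACH DATABASE` is standard SQLite; the `files(path, category)` table, the schema alias `rest` and the helper names are assumptions made for this example.

```python
import os
import sqlite3
import tempfile


def make_db(path, rows):
    """Create a tiny db with a hypothetical files(path, category) table."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE files (path TEXT, category TEXT)")
    con.executemany("INSERT INTO files VALUES (?, ?)", rows)
    con.commit()
    con.close()


def files_in_category(archive_db, rest_db, category):
    """Search both databases with a single query on a single connection."""
    con = sqlite3.connect(archive_db)
    try:
        # The second file becomes visible under the schema name "rest".
        con.execute("ATTACH DATABASE ? AS rest", (rest_db,))
        rows = con.execute(
            "SELECT path FROM main.files WHERE category = ?"
            " UNION ALL"
            " SELECT path FROM rest.files WHERE category = ?"
            " ORDER BY path",
            (category, category),
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        con.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        archive = os.path.join(folder, "archive.db")
        rest = os.path.join(folder, "rest.db")
        make_db(archive, [("D:/Archive/a.jpg", "Family"), ("D:/Archive/b.jpg", "Work")])
        make_db(rest, [("C:/Misc/c.jpg", "Family")])
        print(files_in_category(archive, rest, "Family"))
```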

For now, I think it's much better to concentrate on improving the one DB which we have.
m. Th.

m.Th.
XnThusiast
Posts: 1560
Joined: Wed Aug 16, 2006 6:31 am

Re: DB Thumbnails

Post by m.Th. » Wed Jun 26, 2013 10:35 am

oops66 wrote:Hello,

First, thank you for the course ... by spending some time, nice graphics and energy to help us (xnview forumers)... and to help to improve the XnViewMP DB.

"I think it is clear. If not, pls tell me what do you want to clarify for you."

... Some are for me already clear, but some others not ... so It's probably clear but not only with one pass (for me) ... but the most important thing is than I am (we are) enthusiast ;-)

...So I will make a new read about the horizontal Partitioning: ;)

Edit: Ok ... so for you it's better to have 2 DBs (Catalog DB+ thumbs DB) per disk/partition/or network drive to meet "the root folders list" ? ... to build/use relatives paths into the db ... (But if the data come from a read only support like a DVD, it's not possible ?)
(red color and bold are mine)

Yes, one of the goals of having multiple DBs (i.e. "Horizontal Partitioning") is relative paths, which give the possibility of Relocation. Yes, it is possible with DVDs. It is possible with external USB disks or any other disk which shows up under different drive letters, because besides the path we will record the 'label' of the root (MediaName, as IDImager calls it).

We have some cases here: if the path is a local one, we get the serial number of the drive/thumb drive. If the path is a network share, we get the server name (for ex. from \\myServer\myShare\Archive we'll get myServer).
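
A rough sketch of storing a media name plus a relative path, and of relocating it later. The function names are hypothetical. Reading a real volume serial number is OS-specific, so here the caller passes it in as `local_serial`. Note that this simplified version keeps only the server name for a network share, as in the example above, and drops the share name.

```python
from pathlib import PureWindowsPath


def split_media(path, local_serial=None):
    """Split a full path into (media_name, path relative to the root)."""
    p = PureWindowsPath(path)
    if p.drive.startswith("\\\\"):
        # UNC path: the drive part is \\server\share, keep the server name.
        media = p.drive.lstrip("\\").split("\\")[0]
    else:
        if local_serial is None:
            raise ValueError("a serial number is needed for local drives")
        media = local_serial
    # Store the rest relative to the drive or share root, with forward slashes.
    return media, p.relative_to(p.anchor).as_posix()


def relocate(relative_path, new_root):
    """Rebuild a full path after the media moved to another letter or mount."""
    return str(PureWindowsPath(new_root) / relative_path)


if __name__ == "__main__":
    print(split_media(r"\\myServer\myShare\Archive\a.jpg"))
    media, rel = split_media(r"D:\Photos\a.jpg", local_serial="1A2B-3C4D")
    print(media, rel, relocate(rel, "E:\\"))
```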

Another reason, and the biggest one, is Sustainable Database Speed. One of the big problems with photo managers today is that their catalogs grow, grow and grow... Almost all the companies have this big problem, and because they encountered it only when the program had become huge and the entire codebase was difficult to change, they only provide kludges like "File | New Catalog..." and "File | Open Catalog...".

But we are at the beginning, and I think that we can refactor the code a little in order to have multiple DBs. The point is to keep your high-class, high-value, cataloged (ratings, colors, keywords, albums etc.) DB relatively small and not cluttered (flooded) by thumbs from all over the place.

AfterShotPro works with multiple catalogs (in fact Search is the main feature which bothers the users) but it costs. To see a very nice example of how to work with multiple DBs and how 'slow' it is, download Index Your Files from http://www.indexyourfiles.com/, which is btw a very nice utility to index your files. Add some databases and play with it. It is freeware.
except programs
m. Th.

xnview
Author of XnView
Posts: 31918
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: DB Thumbnails

Post by xnview » Wed Jun 26, 2013 10:41 am

m.Th. wrote: 2nd change (which can be done in parallel) is refactoring the Meta BLOB. Basically, it should be divided into 3 (three) parts: an EXIF table, an IPTC table and the small remainder of information not already covered by the two aforementioned tables. BEWARE! The EXIF and IPTC tables will not hold all the EXIF (or IPTC) info! Only the most searched and the most efficient fields: the ones which you list in the Search window (btw, in the EXIF info you forgot (among others) Shutter Speed and Exposure Compensation, which are basic info). By 'efficient' I mean that we must generally avoid strings (which take a lot of space), but for the most necessary ones we have a solution.
So keep Meta in the Catalog db?
Meta is not used much for search but for thumbnail labels, so I need to keep it like this...
1. There are some fields in IPTC submenu which reappear with the same name in the XMP submenu. They represent the same field or they can have different infos?
Search dialog can search info in IPTC or in XMP metadata
2. The Annotation field from the main menu reappears is the same with another field from IPTC and/or XMP submenus? Where do you store the data for this field? How can it be entered? From where it comes?
Annotation is the comment that you can add on a file (Edit>Set comment)
3. In the main menu and in the submenus there is a entry called "All Fields". This means all fields from the specific menu or all fields from ALL the EXIF/IPTC (whatever...) info knowing that there is much more EXIF/IPTC tags (data) than in your menu?
All known fields
Now, I think that's much better to concentrate on improving the only one DB which we have.
With 2 databases, no??
Pierre.

oops66
XnThusiast
Posts: 1999
Joined: Tue Jul 17, 2007 1:17 am
Location: France

Re: DB Thumbnails

Post by oops66 » Wed Jun 26, 2013 11:24 am

m.Th. wrote:...
AfterShotPro works with multiple catalogs (in fact Search is the main feature which bothers the users) but it costs. In order to see a very nice example how to work with multiple DBs and how 'slow' it is - download Index Your Files from http://www.indexyourfiles.com/ which is btw a very nice utility to index your files. Add some databases and play with it. It is freeware.
except programs
...Ok, thank you for the explanations ... and the link.