oops66 wrote: xnview wrote:
Hence, besides the above improvements about resizing, I'd vote for the database split. Any other opinions?
So one database for all file entries, and one for thumbnails. What's the benefit? 2 threads can already access the thumbnail table...
If thumbnails are in another database, the current database will be smaller, perhaps good???
... I think one of the main benefit by splitting these two kinds of BD datas is :
- Because one is "generated automatically" (the thumbnails process, the DB with the "low added value") and usually big
- The other one filled by human hand and "little fingers"
(keywords, categories, rating, color labels ... ) , the DB with the "high added value" - and generally smaller (so easier to manage, to do a backup, etc...)
And not only. The "little fingers DB" (aka. the Catalog) must be very fast for the searches
. And now is not. And (unfortunatelly) isn't your fault. Why? Let's write another small chapter from the "Free Crash DB Course for Pierre":
As you know, the DB files are organized in pages. In a classical DB engine (NOT SQLite) there are INTEGER, TEXT (ok, VARCHAR), FLOAT... ...and a bunch of other types with a fixed length. And there are BLOBs (BLOB, CLOB, MEMO etc. etc. etc.)... ...variable length types which usually are BIG. Usually much bigger than fixed length types
. Hence, the DB engine keeps in so-called Table Pages all the fixed length types and for the BLOBs keep only a pointer to the first BLOB Page in which is dumped the entire content of the Blob
(and if the Blob doesn't fit in a single Blob Page, in that page there is a pointer to the next Blob Page and so on). As you see, in a normal DB server, by reading just one Table Page, the engine is able to access and process a bunch of rows (hundreds or even thousands of rows) because the Blob(s) are out. That's why some DB engines doesn't allow sorting or even searching on BLOBs because these are just "binary" data (yes, there are some MEMOs or long VARCHAR which are searcheable and sortable but let us not complicate the things).
Unfortunatelly in SQLite the things aren't like this: The BLOBs are stored inline.
...hence, even if they aren't accessed, they degrade the performance by filling the Table Page with their data. So, instead in a 16k page to have 1000s of records we'll have only one (!) or two (!).
http://stackoverflow.com/questions/1725 ... erformance
That's why in SQLite the BLOBs are the last fields (the engine is reading the fields from left to right) and that's why they're pushed outside of our "fast" tables or, even more, out of our "fast" db.
... And maybe it would be interresting to have inside these 2 DBs, in this case, not the full paths, but relatives paths for datas, with a "base path" to configure first (and relatives paths from this "base path" stored into the BDs) ... for a better portability ? (.. just the "base path" to change to view photo from a backup (images from a DVD), external hdd ...)
Oh, relocation! Yes! Of course! We need also to provide a tool for this in the 'DB Maintenance'.
In fact, a perfect DB architecture should look like this:
With both Vertical Partitioning (which we're discussed above) AND
Let me explain about Horizontal Partitioning:
, we can use XnView to browse 'everywhere' and we have 'everywhere' photos on our storage(s) (local and LAN).
But in practice
isn't like that.
Usually, users have One, Two (most of the times), Three or at most few root folders
in which they store images. Most probably you imagined that, since you have in your DB Manager (which you put in Tools | Settings | Browser | CacheDB) an edit box called 'Base path of your pictures'.
However this isn't enough.
Because besides our Archive (one root folder) and the Excluded Folders (folders like C:\Temp\* etc.) there exists a "Gray Zone" with different folders on the same disk which we want to see (view) but not to search / organize (think different images related to other hobbies, secundar activities, temporary items etc.). And this Gray Zone is usually much bigger in Database footprint than our Archive
. Hence we have already the need for two databases: one with high performance, data quality, high quality thumbnails but relatively few items (our "Archive" limited to one root folder - say D:\Archive\) and another one for the 'rest of the world'. In fact is the db which you have now - this db will be, of course, mandatory, and will have the least priority in the dispatcher's db list.
What is the db dispatcher?
For this to work you need to have a dispatcher (ok, a simple text list) which will look at the root paths of each db.
If the requested path to display will be a substring of one of the root paths, the the program will look in the corresponding db. Otherwise look in the 'rest of the world' db.
Perhaps you think now about the connection overheard. Besides the fact that SQLite has a very-ultra-super fast connection time (no authentication & authorization, network protocols etc.), the problem can be easily solved by using a 'connection pool' (see http://en.wikipedia.org/wiki/Connection_pool
). Keep two (or more?) databases always open and use a pointer called, let's say, currentDB which you'll switch to the correct db only when it is needed. In fact, a similar concept can be found in thread programming.
I think it is clear. If not, pls tell me what do you want to clarify for you.
As an aside, enough times there are three (or more) root folders: 'Uncatecorized (Photo Inbox)', 'Archive' and 'rest-of-the-word' - this especially in situations when are involved (small) teams.