How to blow up your entire Catalog with one click

Ideas for improvements and requests for new features in XnView MP

Moderators: helmut, XnTriq, xnview

Post Reply
User avatar
m.Th.
XnThusiast
Posts: 1676
Joined: Wed Aug 16, 2006 6:31 am
Contact:

How to blow up your entire Catalog with one click

Post by m.Th. »

If you have valuable Categories / Assignments - generally items tied to On-Database Photo Management DO NOT do the following!

It is a bug which wipes your entire catalog.


Steps to reproduce:
1. Go to DB Manager (Tools > Settings > Cache DB...)
2. Click "Clean Thumbnails"
3. Wait (no warning appears - just a gauge) and leave the program to do its "work"
4. When it is finished, restart the program.

Enjoy! Your catalog is gone! :-)
Image

Image

In fact this bug was reported several times here, but because it is a critical data-loss bug (a showstopper), I raise it again with a fix.

How to fix it

For the users:

Do not press that button! (yet) :(

For Pierre:

There are several approaces, but I think that the one with the most benefits (fast & other enhancements) is the following:
NOTE: I marked the secondary enhancements which can be easily implemented by fixing this bug in a color like this.

Add in table Images a field called 'TagsCount' of type Integer. Fill it by default with 0 (zero).

Add to the table TagsTree the following triggers (not tested):

Code: Select all

Create  Trigger MAIN.TagsTree_Images_AI AFTER INSERT On TagsTree
--when
begin
  update images set TagsCount=TagsCount+1 where images.IMAGEID=NEW.IMAGEID;
end;

Create  Trigger MAIN.TagsTree_Images_BD BEFORE DELETE On TagsTree
--when
begin
  update images set TagsCount=TagsCount-1 where images.IMAGEID=OLD.IMAGEID;
end;
Of course I assume that due of GUI layout and program logic, you do not have updates on that table, only inserts and deletes. But if you do, you need yet another trigger.

With the above triggers, you will have always the number of Categories (keywords) assigned to each image in real-time.
The speed penalty is towards to 0 since Images.ImageID is indexed and usually one assigns a keyword to relatively few images simultaneously.

Also, there is another feature here:
Based on this, you can also display in the thumbnail's corner the number of the keywords assigned, putting number 0 (ie. not yet categorized) with another color (the IDImager's approach) or, simply, just put a small icon in a corner of the thumb when image has keywords( ie TagsCount<>0) like in Lightroom.

But let's go back to our bug.

Since we have now all the data which we need in the Images table then we can simply do:

Code: Select all

DELETE FROM IMAGES WHERE RATING=0 AND COLOR=0 AND TagsCount=0;
Now everything is ok if we assume that the Tag Checkboxes are a temporary marking and we will not check if a photo is tagged or not. When we'll implement the albums then we'll talk again - we can do a JOIN between these tables. Also here we can do a JOIN but I preffer to have a field for the Categories because I pressume that many (will want to) use them. Also, if someone wants to save the tagged files also, please raise the hand.

The only thing which is left are the empty folder entries in the Folders table.

In order to remove them write a separate method/function and call it. Do like this because the very same problem (empty directories left in the database) appear also after Move and Delete operations. For your knowledge (only of) the fastest queries to retreive the empty folders is:

Code: Select all

select FOLDERID from folders where not exists (select IMAGEID from images where images.FOLDERID = folders.folderID limit 1);
...but of course you can use directly the DELETE in order to wipe all the empty folders through a single command:

Code: Select all

DELETE FROM folders 
WHERE 
  folderid IN 
    ( 
      SELECT 
        FOLDERID 
      FROM 
        folders 
      WHERE 
        NOT EXISTS 
          ( 
            SELECT 
              IMAGEID 
            FROM 
              images 
            WHERE 
              images.FOLDERID = folders.folderID 
            LIMIT 
              1 
          ) 
    ) ;
Don't be afraid. It executes pretty fast (0,006 secs on my machine having ~ 23000 images in the catalog - yes, I have a backup of it :D )

Also, be sure that the above DELETE to put in the DB Manager under a button called "Clean empty directories" or similar.

----
For the fist time run with the new schema, the users will have the new TagsCount field but will be filled with 0 (zero) even if they have files with keywords (categories) assigned. IOW, our persistent aggregate field is out of sync. That's why we need to fill it. We can update the field with the following (not tested)

Code: Select all

update Images set TagsCount=(Select Count(*) From TagTree where TagsTree.ImageID=Images.ImageID);
Of course, also this code you can keep around somewhere, you never know when you need to (re)sync your materialized aggregate field again.

Just my 2c & HTH

PS: Under the 'Clean Thumbs' button, even if the new algorithm above is much much faster, perhaps is better to put a message before, something like "We will do blablabla... and it will take a little. Confirm?" and keep the moving gauge there.
m. Th.

- Dark Themed XnViewMP 1.7.1 64bit on Win11 x64 -
User avatar
m.Th.
XnThusiast
Posts: 1676
Joined: Wed Aug 16, 2006 6:31 am
Contact:

Re: How to blow up your entire Catalog with one click

Post by m.Th. »

Re-read the above message! It was edited!

I wrote in the beginning

DELETE FROM IMAGES WHERE RATING<>0 AND COLOR<>0 AND TagsCount<>0;

which is obviously wrong. It should read

DELETE FROM IMAGES WHERE RATING=0 AND COLOR=0 AND TagsCount=0;

Now it is corrected.

Silly me! :(

Sorry!
m. Th.

- Dark Themed XnViewMP 1.7.1 64bit on Win11 x64 -
User avatar
xnview
Author of XnView
Posts: 46257
Joined: Mon Oct 13, 2003 7:31 am
Location: France
Contact:

Re: How to blow up your entire Catalog with one click

Post by xnview »

I don't understand, you have already an icon when a file has categories...

And what's the problem??
'Clean thumbnails' remove all thumbnails from database
The problem is with slow function?
Pierre.
User avatar
m.Th.
XnThusiast
Posts: 1676
Joined: Wed Aug 16, 2006 6:31 am
Contact:

Re: How to blow up your entire Catalog with one click

Post by m.Th. »

xnview wrote:I don't understand, you have already an icon when a file has categories...

And what's the problem??
'Clean thumbnails' remove all thumbnails from database
The problem is with slow function?
Yes, I didn't saw that there is an icon when a file has categories. My mistake. But the problem isn't there.

In fact there are more problems.

First, is the button label: "Clean Thumbnails"

In the world of photo managers nowadays there are TWO databases (Ok, there are programs with more than two but let's keep the things simple).

These TWO database are: Thumbnails database and Catalog database.

It really doesn't matter in our discussion that these two can be in practice in the same file. The main thing is that they are different because:

- Thumbnails are generated and hence you can rebuild them (more or less) automatically.
- The Catalog (ratings, colors, categories assigned to the photos) are hand made, entered with hard work and patience by humans at the keyboard looking at the photos. Nobody can "automatically regenerate" them.

Hence if the Thumbnails database gets lost, well, it isn't something very nice, but one can rebuild it. FTR, we regenerated several times a thumbs db containing approx. half a million thumbs.

But if the Catalog gets lost, then it is a catastrophe. Months of work in organizing, rating, thinking how to layout your keyword tree - all are gone. All your value, all your work is in the catalog, not in the thumbs.

That's why it is inadmissible to have a button called "Clean Thumbnails" to delete your entire catalog!

That's why I changed the algorithm in order to delete only the thumbs which DOES NOT have Catalog entries.

So, I think that now we can do the following:

Rename the 'Clean Thumbnails' button in 'Create new Database...'
and when one clicks on it, a BIG warning message should appear "This will empty your entire database. Are you sure that you want to continue?" - also perhaps here is much better to give the possibility to save the db elsewhere... Also, istm that emptying the DB is taking way longer than it should. There are much faster ways? Are you interested? :D

Also, you have there an "Optimize" button. When one clicks on it, instead of doing your present code it will open a form like this:
myImage(2).png
myImage(2).png (15.41 KiB) Viewed 935 times
When one presses the "Ok", the program will execute the checked maintenance tasks.
- 1st checkbox is your old code (which I suppose that is the 'Vacuum' command)
- 2nd - remove the empty directories - folders which have 0 entries in 'Images' table. Algorithm described in my 1st post in this thread.
- 3rd - the algorithm from my first post in this thread. But (perhaps) without the 'TagsCount' field. I think that it can be done in one single SQL command, even if it will be a little bit slower, but usually, users can wait in such situations.
- 4th and 5th... surprise! :) if you're interested I can give you a hint how to do it faster.

Everything is clearer now?
Comments?
Suggestions?
m. Th.

- Dark Themed XnViewMP 1.7.1 64bit on Win11 x64 -
User avatar
xnview
Author of XnView
Posts: 46257
Joined: Mon Oct 13, 2003 7:31 am
Location: France
Contact:

Re: How to blow up your entire Catalog with one click

Post by xnview »

m.Th. wrote: That's why it is inadmissible to have a button called "Clean Thumbnails" to delete your entire catalog!
But currently it's not like that :?:
- 1st checkbox is your old code (which I suppose that is the 'Vacuum' command)
Currently optimize check for orphaned files (check file system + vacuum)
- 2nd - remove the empty directories - folders which have 0 entries in 'Images' table. Algorithm described in my 1st post in this thread.
- 3rd - the algorithm from my first post in this thread. But (perhaps) without the 'TagsCount' field. I think that it can be done in one single SQL command, even if it will be a little bit slower, but usually, users can wait in such situations.
ok
- 4th and 5th... surprise! :) if you're interested I can give you a hint how to do it faster.
So almost like 'optimize', must check file system??
Pierre.
User avatar
m.Th.
XnThusiast
Posts: 1676
Joined: Wed Aug 16, 2006 6:31 am
Contact:

Re: How to blow up your entire Catalog with one click

Post by m.Th. »

xnview wrote:
m.Th. wrote: That's why it is inadmissible to have a button called "Clean Thumbnails" to delete your entire catalog!
But currently it's not like that :?:
Something fishy is going on. I retested with a new, small db and everything is ok. However, the first time, whith a more complex db I "succeded" to blow it in the air. Hmmm.... could you try to put the procedure between Begin/End Transaction and if any error appears to rollback the things? I will send you a PM with this.
- 1st checkbox is your old code (which I suppose that is the 'Vacuum' command)
Currently optimize check for orphaned files (check file system + vacuum)
Hmmm... mhmmm... mmmm... I think that's too much in one pass. :?
  • 1. There is something which is very very fast and can bring very good optimisation: Missing (orphan) File Directories Check. The scan for these is quick and delete from DB is also very quick. And you can delete a bunch of files. Hence this is an optimisation which must be run often.

    2. Then there is something which is medium-low speed but can bring good optimisation but only if you know that there are such cases: Missing (Orphan) Files Check. The user will run this only if he knows that from existing directories are files which are deleted/moved from outside of program.

    3. And third, there is VACUUM. A slow process with a disputable (low) benefit, especially with the advent of hybrid HDDs, SDDs, HDDs with big cache (16MB or so). Perhaps is better to know that in disk storage layer the most expensive thing is aquiring space. That's why enough DB engines make from the begining big, empty files for their DBs - some of them having a quite powerful syntax for this (eg. Oracle). That's why, usually, the databases do not shrink and in order to shrink a database you must do a more complicated procedure (ie. taking the database offline + a backup/restore cycle). Also, in OLTP/data aquisition applications there's a technique in which the app when is idle or in low-stress periods of activity, fills on purpose the tables with 'empty' data in order to aquire space.
That's why I broken them in three parts: Orphan Directories Check, Orphan FIles Check and Optimize (Vacuum)
- 2nd - remove the empty directories - folders which have 0 entries in 'Images' table. Algorithm described in my 1st post in this thread.
- 3rd - the algorithm from my first post in this thread. But (perhaps) without the 'TagsCount' field. I think that it can be done in one single SQL command, even if it will be a little bit slower, but usually, users can wait in such situations.
ok
For the 3rd - it seems that the 'Clean Thumbs' (somewhat) works but it isn't optimal. As I said, I will send you a special PM for this.
- 4th and 5th... surprise! :) if you're interested I can give you a hint how to do it faster.
So almost like 'optimize', must check file system??
Yes. But, as I said above, in different phases because the user can choose to check only the directories which is much more efficient than files. Also, I put an edit there in order to enter the root path for the check. This is more important than it seems at the first sight. It has two main uses:
  • 1. To restrict the area of searching where the user knows that there it might be something to find. (Hence before start scanning the file system, is better to do a 'select pathname from folders where pathname like '+myRootFolder+'%' - beware at myRootFolder to be properly formated (with the proper \ (slash) and not /, also having a slash at the end) and search only these folders. Also, take care at the root folder to have the proper case (LIKE is case sensitive for non-ASCII characters). For this you check the existence of the folder and take from disk the actual name. Anyway, you must do this. If the root folder doesn't exist, the entire process stops.

    2. To restrict the area of searching where the user knows that there aren't pseudo-offline files. For example I have in catalog files from network. Files from external SATA disks. What if the engine will not find the files/directories because the server is offline or the disk is out and decides to delete them?
m. Th.

- Dark Themed XnViewMP 1.7.1 64bit on Win11 x64 -
Post Reply