Tuesday 28 May 2019

RE: mlocate - what is it good for?

Hi,

While there are lots of e-mails on this thread,
I decided to reply to this one.

I use locate several times per day on every Linux
computer I use, and I often manually update the database,
including after a new machine startup (VM, for example).

I consider it to be one of the most useful commands available.
But yes, I could install the package myself if I have to. However,
I will forget this discussion and go through the "Oh ya, I have to
install it now" moment that someone mentioned.

On 2019.05.24 12:02 Steve Langasek wrote:
> On Thu, May 23, 2019 at 03:13:49PM -0400, Little Girl wrote:
>> Steve Langasek wrote:

>>> I don't think the benefit of having locate available by default
>>> justifies the daily disk thrashing / energy usage on every Ubuntu
>>> machine everywhere.

>> Just curious, but how bad (or excessive or whatever) is this disk
>> thrashing / energy usage that you mentioned?

> It is difficult to answer this in aggregate because the data is hard to come
> by, and it definitely varies by type of disk and by number of files on the
> system. You can get a sense of the effect on your own system by running
> e.g. 'sudo iostat /dev/sda 30' in one window and 'sudo
> /etc/cron.daily/mlocate' in another. On my otherwise-idle desktop with an
> SSD, that only takes 3 seconds to complete and it only reads a few hundred
> MB of data off the disk (in order to open every directory and stat every
> file).
>
> I only have about 1.5 million files on my disk.
>
> On machines with a lot more files; or machines with spinning disks instead
> of SSDs; or heavily loaded servers, the impact is bound to be much higher
> than 3 seconds of I/O.

How bad the "daily disk thrashing" is depends very strongly on how much
directory data is already cached in memory.

Example 1: My main gateway/router/firewall Ubuntu server, with magnetic disk:

Number of files: ~ 0.80 million.
Time to update database starting from flushed memory: 10.7 seconds
Time to update database starting from approximately fully cached directory stuff: 0.5 seconds.
Actual time for the cron daily mlocate task to run last night: 1 second.
Locate time (entire disk): 1.1 seconds
Find time (entire disk): 26.7 seconds

Example 2: My main Ubuntu test server, with magnetic disk:
Number of files: 4.3 million and 6.3 million, after adding 2 million for testing.

Time to update locate database starting from flushed memory:
After adding 2 million files: 12 minutes 45 seconds.
With no/minimal file changes: 10 minutes 50 seconds.

Time to update locate database starting from lots of cached directory stuff:
After adding 2 million files: fast (did not write it down).
With no/minimal file changes: 4.2 seconds.
Actual time for the cron daily mlocate task to run last night: 6 seconds.
Locate time (entire disk): 2.7 seconds
Find time (entire disk, memory flushed): 17 minutes 42 seconds
Find time (entire disk, lots of cached directory stuff): 14.2 seconds.
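
For anyone wanting to try the same flushed versus cached comparison, here
is a minimal sketch. It is not necessarily the exact set of commands behind
the numbers above. It assumes Linux's /proc/sys/vm/drop_caches interface,
root privileges, and that "updatedb" on the PATH is the mlocate one. The
function names elapsed_s and flush_caches are made up for illustration.

```shell
#!/bin/sh
# Sketch only: compare a cold-cache and a warm-cache locate database update.

# Print the whole seconds a command takes to run.
# Resolution is 1 second; bash's "time" keyword gives finer numbers.
elapsed_s() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s)
    echo $(( end - start ))
}

# Write dirty pages to disk, then drop page cache, dentries and inodes.
flush_caches() {
    sync
    echo 3 > /proc/sys/vm/drop_caches
}

# Only touch the system when explicitly asked to (run as root with --run).
if [ "${1:-}" = "--run" ]; then
    flush_caches
    echo "flushed memory: $(elapsed_s updatedb) seconds"
    echo "cached:         $(elapsed_s updatedb) seconds"
fi
```

The second run reuses whatever directory data the first run pulled into
memory, which is where the large difference comes from.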

For "energy usage": I think it is minimal and well worth it.

My main test server processor seems to draw about 1 extra watt while
the locate database update task is running. (I do have a good mains
power data logger, but didn't hook it up for this.) So, for
last night's 6 second run time, that is 6 joules of extra processor
energy. Guessing at 3 times that back at the mains gives 18 joules.
(Even with no caching, it's < 2000 joules for the 11 minutes.)
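
The arithmetic, as a rough sketch: energy in joules is watts times seconds,
scaled by the guessed factor of 3 to get back to the mains. The 1 watt and
the factor of 3 are the estimates above, not measurements, and the function
name mains_joules is only for illustration.

```shell
#!/bin/sh
# mains_joules EXTRA_WATTS RUN_SECONDS MAINS_FACTOR
# Integer arithmetic only: joules = watts * seconds * factor.
mains_joules() {
    echo $(( $1 * $2 * $3 ))
}

mains_joules 1 6 3      # last night's cached run: prints 18
mains_joules 1 660 3    # uncached, about 11 minutes: prints 1980
```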

... Doug

--
ubuntu-devel mailing list
ubuntu-devel@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel