Thursday 30 May 2019

Re: mlocate - what is it good for?

Hi all, I am a very inactive member of this list; for years I have been in the "read-only" contingent, but I feel I should give my impression on this.

I think we should really distinguish between multi-user installs, where a number of people log in and use the command line, and scripted cloud-server installs, where at most one user with administrator capabilities is going to use the shell. For the former, I think including mlocate in the default install is the sanest decision; otherwise the weakest link in the chain, the shell user without the power to install packages, will be the one harmed. And Canonical can't avoid the effect of "officially de-endorsing" mlocate as a base system utility if they remove it from the default install: many institutions and businesses simply use the default install for their multi-user systems, and if mlocate is not there, it will just become one of thousands of "extra" utilities that the administrator may or may not install, i.e., it will not be there unless there is strong pressure for its use.

But of course, this is for multi-user servers. I can't see such a server suffering from the "multiple instances having the file database updated at the same time" problem, because ideally it would not simply be a shared cloud instance.


On Tue, May 28, 2019 at 3:58 PM Doug Smythies <dsmythies@telus.net> wrote:
Hi,

While there are lots of e-mails on this thread,
I decided to reply to this one.

I use locate several times per day on every Linux
computer I use, and I often manually update the database,
including after a new machine startup (VM, for example).

I consider it to be one of the most useful commands available.
But yes, I could install the package myself if I have to. However,
I will forget this discussion and go through the "Oh ya, I have to
install it now" moment that someone mentioned.

On 2019.05.24 12:02 Steve Langasek wrote:
> On Thu, May 23, 2019 at 03:13:49PM -0400, Little Girl wrote:
>> Steve Langasek wrote:

>>> I don't think the benefit of having locate available by default
>>> justifies the daily disk thrashing / energy usage on every Ubuntu
>>> machine everywhere.

>> Just curious, but how bad (or excessive or whatever) is this disk
>> thrashing / energy usage that you mentioned?

> It is difficult to answer this in aggregate because the data is hard to come
> by, and it definitely varies by type of disk and by number of files on the
> system.  You can get a sense of the effect on your own system by running
> e.g. 'sudo iostat /dev/sda 30' in one window and 'sudo
> /etc/cron.daily/mlocate' in another.  On my otherwise-idle desktop with an
> SSD, that only takes 3 seconds to complete and it only reads a few hundred
> MB of data off the disk (in order to open every directory and stat every
> file).
>
> I only have about 1.5 million files on my disk.
>
> On machines with a lot more files; or machines with spinning disks instead
> of SSDs; or heavily loaded servers, the impact is bound to be much higher
> than 3 seconds of I/O.

The "daily disk thrashing" is a very strong function of memory caching.
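
For anyone who wants to reproduce this kind of cold-cache versus warm-cache comparison, a minimal timing sketch in Python follows. It is only an illustration: the helper name time_command is made up here, and the demonstration times a harmless command rather than the real update job.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (wall-clock seconds, exit status).

    time_command is a hypothetical helper name used only for this sketch.
    """
    start = time.perf_counter()
    result = subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start, result.returncode


# Harmless demonstration: time a Python interpreter that does nothing.
elapsed, status = time_command([sys.executable, "-c", "pass"])
print("exit status %d after %.3f seconds" % (status, elapsed))
```

To time the real job, the same helper could be called as root with ["/etc/cron.daily/mlocate"], the path mentioned in the quoted text above. For a cold-cache number, one way (an assumption on my part; the message does not say how memory was flushed) is to run sync and then write 3 to /proc/sys/vm/drop_caches before the first run. Running the command a second time right away should then give a warm-cache number.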

Example 1: My main gateway/router/firewall Ubuntu server, with magnetic disk:

Number of files: ~ 0.80 million.
Time to update database starting from flushed memory: 10.7 seconds
Time to update database starting from approximately fully cached directory stuff: 0.5 seconds.
Actual time for the cron daily mlocate task to run last night: 1 second.
Locate time (entire disk): 1.1 seconds
Find time (entire disk): 26.7 seconds

Example 2: My main Ubuntu test server, with magnetic disk:
Number of files: 4.3 million and 6.3 million, after adding 2 million for testing.

Time to update locate database starting from flushed memory:
  After adding 2 million files: 12 minutes 45 seconds.
  With no/minimal file changes: 10 minutes 50 seconds.
Time to update database starting from lots of cached directory stuff:
  After adding 2 million files: fast (did not write it down)
  With no/minimal file changes: 4.2 seconds.
Actual time for the cron daily mlocate task to run last night: 6 seconds.
Locate time (entire disk): 2.7 seconds
Find time (entire disk, memory flushed): 17 minutes 42 seconds
Find time (entire disk, lots of cached directory stuff): 14.2 seconds.

For "energy usage": I think it is minimal and well worth it.

My main test server processor seems to take about 1 extra watt while
the locate database update task is running. (I do have a good mains
power data logger, but didn't hook it up for this.) So, for
last night's 6 second run time that is 6 Joules of extra processor
energy. Guess at 3 times that back at the mains, for 18 Joules.
(Even with no caching it's < 2000 Joules for the 11 minutes.)
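
The arithmetic above can be checked with a few lines of Python. This is just a sketch using the figures quoted in this message; the function name and the mains_factor parameter are invented for illustration, and the factor of 3 is the guess stated above, not a measurement.

```python
def extra_energy_joules(extra_watts, run_seconds, mains_factor=3.0):
    """Return (processor_joules, mains_joules) for one database update run.

    Energy in joules is power in watts multiplied by time in seconds.
    mains_factor scales processor energy up to a guess at the mains.
    """
    processor = extra_watts * run_seconds
    return processor, processor * mains_factor


# Last night's cached run: about 1 extra watt for 6 seconds.
cached = extra_energy_joules(1.0, 6)

# Uncached run: about 1 extra watt for roughly 11 minutes.
uncached = extra_energy_joules(1.0, 11 * 60)

print(cached)    # (6.0, 18.0)
print(uncached)  # (660.0, 1980.0)
```

So the uncached case comes to 1980 Joules at the mains under the same guess, which is where the "< 2000 Joules" figure comes from.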

... Doug




--
Cláudio "Patola" Sampaio
MakerLinux Labs - Campinas, SP