Kernel Traffic #234 For 6Oct2003

There were 501 different contributors. 254 posted more than once. 193 posted last week too.

9Sep2003-18Sep2003 (48 posts) Subject: "experiences beyond 4 GB RAM with 2.4.22"

Stephan von Krawczynski reported that when he upgraded from 4 gigs of RAM to 6 under 2.4.22, he noticed his NFS clients starting to time out, general interactivity seemed terrible, and network performance dropped as well. He asked if this was just the way things were, and Andrea Arcangeli suggested he try kernel 2.4.22-aa1. Neil Brown also couldn't explain the problems, but suggested upgrading to one of the 2.6-test kernels, or at least configuring the kernel for a maximum of 4 gigs instead of 64, as Stephan had done originally. It wouldn't use all of Stephan's memory, Neil admitted, but it would be interesting if the system sped up again. Stephan said he'd actually already tried that, and found the system to work perfectly, even though it left 2 gigs of RAM unused.

Marcelo Tosatti also suggested trying Andrea's patches, as they contained significant changes to the Virtual Memory subsystem. Stephan tried this with 4 gigs of RAM, and again found no problems, though the problems with 6 gigs remained. At around this point, Alan Cox remarked, "The 2.6 tree is somewhat better about this but at the end of the day if your I/O subsystem can't do the job your box will not perform ideally. For some workloads its a huge win to have the extra RAM, for others the I/O is a real pain. Also in some cases it might be interesting to try using the extra RAM above the 4G boundary as a giant ram disk and using it as first swap device. I don't know anyone who explored that however."

In the course of discussion, Andrea Arcangeli made a post, and his sig included the following text:

/*
 * If you refuse to depend on closed software for a critical
 * part of your business, these links may be useful:
 *
 * rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.5/
 * rsync.kernel.org::pub/scm/linux/kernel/bkcvs/linux-2.4/
 * http://www.cobite.com/cvsps/
 *
 * svn://svn.kernel.org/linux-2.6/trunk
 * svn://svn.kernel.org/linux-2.4/trunk
 */

Larry McVoy objected:

We're providing the service which enables all of the below and without our good will that service is at risk. You're publicly slamming the providers of that service.

If you want to do that, that's your right, but that leads us to ask:

wasn't the deal that we do the gateway and you stop whining?
why should we provide this gateway service at all if there is whining?
why shouldn't we firewall you off since you are whining?
or if firewalling fails, put in a day or two delay in the gateway?

Pavel Machek replied, "Eh? You are providing a service and he provides you advertising. I can't see anything about slamming you. I guess that Andrea does not think about kernel developing as a business, so it is not really targeted at you. But anyway its *his* signature." And Larry said:

Other people may not agree with your view on this one Pavel. Most people read that signature and saw it as a negative comment about BitKeeper.

People have told me they believe that if BitMover isn't getting benefit from the free use of BK, free BK will go away. People don't want to have to depend on my goodwill, they want BitMover to derive benefit so that my goodwill doesn't matter, it's just smart business to give BK away.

If that is really what people here think then that means there has to be some benefit for BitMover. One of the benefits is that we get to say that the kernel team uses it and get some marketing advantage out of that. But that benefit diminishes if what people are saying is negative.

It makes sense that people are uneasy about depending on BK given the amount of work it takes to provide it and the amount of grief we take for providing it. I don't know why I didn't see that earlier, it's an unstable situation.

I know there are some people who will never be happy until everything is GPLed, I can't help those people other than provide the gateway. In return, those people need to stop whining, the gateway has to be enough.

For the rest of the people, I'm looking for suggestions on how to make this situation more stable. It took me a while but I can see why you are nervous, I'd be nervous in your position. I'm nervous about doing any real marketing of the kernel's use of BK because I figured it would lead to more flame wars. I'm starting to think that if we were doing that it might actually lead to less flames, based on the theory that we would then need you so you continue to get BK for free. If you have an opinion on that I'd like to know it.

10Sep2003-18Sep2003 (127 posts) Subject: "Update on AMD Athlon/Opteron/Athlon64 Prefetch Errata"

Continuing my yearly tradition of posting just one long novel to LKML every year, here is the literary update on the Prefetch Errata that the early 2.6 Kernels hit on AMD Athlon Processors.

This previously published errata can occur infrequently and is present in all AMD Athlon processors and earlier AMD Opteron/Athlon64 processors. See [1] and [2].

The full details are below, but the key point is that under certain circumstances, prefetch instructions can get memory management faults for addresses which would fault if they were accessed by a load or store instruction. We plan to revise our published errata with the new information below.

The errata requires a kernel workaround, but the good news is that it is:

Harmless in most cases where it could occur. Most of the time the prefetch will be targeting memory that is accessible under the current privilege mode. So the page will simply be "faulted in" slightly earlier than needed.
Rare and Infrequent. AMD Athlon processors have been available for years running numerous Operating Systems and only recently have we hit this errata outside of code specifically designed to target the errata -- requiring tens of thousands of iterations to cause it.
It can be worked around. Andi Kleen has 2.6 and 2.4 kernel patches that we have tested at AMD on a large number of AMD Athlon processors and AMD Opteron/Athlon64 processors (both legacy x86 and x86-64 long mode). They work just fine. (Andi will be posting them soon when he wakes up ;-)
AMD is fixing this in future revisions of AMD Opteron/Athlon64 processors.
Andi's kernel patches will not be needed on future AMD processors but it is forward compatible and so won't break on them either.

The Details

Software prefetch instructions are defined to ignore page faults. Under highly specific and detailed internal circumstances, the following conditions may cause the PREFETCH instruction to report a page fault.

The target address of the PREFETCH would cause a page fault if the address was accessed by an actual memory load or store instruction under the current privilege mode.
The instruction is a PREFETCH or PREFETCHNTA/0/1/2 followed in execution-order by an actual or speculative byte-sized load to the same address.

In this case, the page fault exception error code bits for the faulting PREFETCH would be identical to that for a byte-sized load to the same address.
The instruction is a PREFETCHW followed in execution-order by an actual or speculative byte-sized store to the same address.

In this case, the page fault exception error code bits for the faulting PREFETCHW would be identical to that for a byte-sized store to the same address.

Note that some misaligned accesses can be broken up by the processor into multiple accesses where at least one of the accesses is a byte-sized access.

If the target address of the subsequent memory load or store is aligned and not byte-sized, this errata does not occur and no work-around is needed.

So the net effect is that an unexpected page fault may occur infrequently on a PREFETCH instruction.

Kernel Work-around

The kernel can work around the errata by modifying the Page Fault Handler in the following way. This is what Andi Kleen's patches do. Because the actual errata is infrequent it does not produce an excessive number of page faults that affect system performance.

Continue to allow the page fault handler to satisfy the page fault. If the faulting instruction is permitted access to the page, return to it as usual.
If the faulting instruction is not permitted access to the page, scan the instruction stream bytes at the faulting Instruction Pointer to determine if the instruction is a PREFETCH.
If it is not a PREFETCH instruction, generate the appropriate memory access control violation as appropriate.
If the faulting instruction is a PREFETCH instruction, simply return back to it; the internal hardware conditions that caused the PREFETCH to fault should be removed and operation should continue normally.
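
The "scan the instruction stream bytes" step can be illustrated with a small user-space sketch in Python. It is not Andi Kleen's patch: the function names are made up for illustration, and the prefix handling is simplified. It relies only on the standard x86 encodings, where PREFETCH/PREFETCHW are opcode 0F 0D and PREFETCHNTA/T0/T1/T2 are opcode 0F 18.

```python
# Illustrative sketch only; names are hypothetical, not taken from the kernel patch.

# Legacy prefixes that may precede the opcode: segment overrides,
# operand-size and address-size overrides, LOCK, REPNE and REP.
LEGACY_PREFIXES = {0x26, 0x2E, 0x36, 0x3E, 0x64, 0x65,
                   0x66, 0x67, 0xF0, 0xF2, 0xF3}
MAX_INSN_LEN = 15  # architectural limit on the length of one x86 instruction


def is_prefetch(code: bytes, long_mode: bool = False) -> bool:
    """Return True if the bytes at the faulting IP decode as a prefetch."""
    limit = min(len(code), MAX_INSN_LEN)
    i = 0
    # Skip prefixes; in x86-64 long mode the REX bytes 0x40-0x4F are prefixes too.
    while i < limit:
        byte = code[i]
        if byte in LEGACY_PREFIXES or (long_mode and 0x40 <= byte <= 0x4F):
            i += 1
        else:
            break
    # Two-byte opcode: 0F 0D (3DNow! style) or 0F 18 (SSE style).
    return i + 1 < limit and code[i] == 0x0F and code[i + 1] in (0x0D, 0x18)


def handle_denied_access(code: bytes, long_mode: bool = False) -> str:
    """Decision taken once the handler knows the access is not permitted."""
    return "return-to-instruction" if is_prefetch(code, long_mode) else "signal"
```

A real handler would also have to fetch those bytes safely from the faulting context, which is where the recursion concerns discussed later in the thread come from.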

General Work-around

If the page-fault handler for a kernel can be patched as described above, no further action by software is required. The following general work-arounds should only be considered for kernels where the page-fault handler can not be patched and a PREFETCH instruction could end up targeting an address in an "inaccessible" page. (An "inaccessible" page is one for which memory accesses are not allowed under the current privilege mode.)

Because the actual errata is infrequent, it does not produce an excessive number of page faults that affect system performance. Therefore a page fault from a PREFETCH instruction for an address within an "accessible" page does not require any general work-around. (An "accessible" page is one for which memory accesses are allowed under the current privilege mode once the page is resident in memory)

Software can minimize the occurrence of the errata by issuing only one PREFETCH instruction per cache-line (a naturally-aligned 64-byte quantity on AMD Athlon and AMD Opteron/Athlon64) and ensuring one of the following:

In many cases, if a particular target address of a prefetch is known to encounter this errata, simply change the prefetch to target the next byte.
Avoid prefetching inaccessible memory locations, when possible.
In the general case, ensure that the address used by the PREFETCH is offset into the middle of an aligned quadword near the end of the cache-line. For example, if the address desired to be prefetched is "ADDR", use an offset of 0x33 to compute the address used by the actual PREFETCH instruction as: "(ADDR & ~0x3f) + 0x33"
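
The address adjustment in the last item is plain bit arithmetic. As a quick illustration, here is the same formula in Python; the function and constant names are ours, and only the 64-byte line size and the 0x33 offset come from the text above.

```python
CACHE_LINE_SIZE = 64    # bytes, per the text above for Athlon and Opteron/Athlon64
PREFETCH_OFFSET = 0x33  # lands inside an aligned quadword near the end of the line


def adjusted_prefetch_address(addr: int) -> int:
    """Compute (ADDR & ~0x3f) + 0x33 for a non-negative address."""
    line_base = addr & ~(CACHE_LINE_SIZE - 1)  # round down to the cache-line start
    return line_base + PREFETCH_OFFSET


# Every address in the same cache line maps to the same prefetch target.
print(hex(adjusted_prefetch_address(0x1000)))  # 0x1033
print(hex(adjusted_prefetch_address(0x103F)))  # 0x1033
print(hex(adjusted_prefetch_address(0x1040)))  # 0x1073
```

The target still lies in the cache line that holds ADDR, so the prefetch brings in the same data; only the byte used as the prefetch operand changes.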

Footnotes

[1] AMD Athlon(tm) Processor Model 6 Revision Guide 24332F June 2003.

www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24332.pdf

[2] Revision Guide for AMD Opteron(tm) Processors 25759 Rev. 3.07 Aug 2003

www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.PDF

Andi Kleen then posted his patch, saying:

Here is a patch to detect a prefetch exception for 2.6.0-test5/i386

I'm posting the versions for 2.4.22/x86-64 and 2.4.22/i386 in separate mail. 2.6/x86-64 is not done yet, but will be in my next merge.

The patches only change the slow path in the page fault handler - PREFETCH is only checked for when the kernel would send a signal anyways or print an oops. The check is also rather cheap so it is unconditionally enabled.

It handles SSE2 prefetches and 3dnow! style prefetches.

The only tricky bit is that it has to be careful to avoid recursive segfaults. The prefetch checker can cause another page fault when it does __get_user, but these should not recurse more than once.

This is ensured by the placement of the checks in the page fault handler and only checking for prefetches if the fault came from user space or the exception tables have been already checked. To make this work without a per task exception nest counter I had to change the SIGBUS handling path slightly. Now when a get/put_user in kernel causes a SIGBUS it is not delivered to user space. Instead you just get the standard EFAULT back.

It also removed Andrew's old workaround for the problem.

In the course of discussion, Andi also added (in response to the suggestion that the code be made conditional, in case the errata was ever fixed):

My understanding is that it will never be fixed on the K7 (current Athlons), and these will be with us for a long time; more or less forever, just like the f00f bug :-)

I would agree with you if the patch had some bad impact on common paths, but I don't think that's the case here. It merely adds a cheap check to an already slow path.

On the other hand the errata is so unlikely that I doubt it will be frequently hit anyways. 2.6 kernel hitting it was just very very unlucky, and it only did so very infrequently.

Just to fix the kernel we could have chosen a different workaround, like adding an exception table entry to each prefetch and jumping back (I did that on the x86-64 port originally). But this way is also not that bad and it fixes hypothetical user space programs that could maybe hit it too.

Of course they may want to also fix it in a different way to run on older kernels (e.g. handling the signal in user space or avoiding the conditions). But doing it centrally in the kernel is a bit cleaner and at some point people have to update their kernels anyways.

Elsewhere, Linus Torvalds said critically, "This patch is fragile and looks pointless. What's wrong with the current status quo that just says "Athlon prefetch is broken"?" Andi pointed out that the status quo did not fix breakage in user space, and Dave Jones added that the status quo "cripples the earlier athlons which don't have this errata. Andi's fix at least makes prefetch work again on those boxes. It's also arguable that prefetch() helps the older K7's more than the affected ones." But Andi said that all Athlons suffered from this errata.

Chris Wright offered a short patch, explaining, "Here's a couple small tweaks. The first is to netpoll_setup. The settle time was too short for my e100, and the system would hang. The second is to netconsole so that it registers a console with CON_PRINTBUFFER. This helps debugging early bootup issues where you want to capture data from before netconsole is initialized. Perhaps it should be a param to netconsole?" Matt Mackall suggested that the netpoll_setup settle time adjustment should be made configurable at the command-line, instead of hardwired into the code. Chris agreed with this. Matt also said that it was fine for the netconsole change to not be conditional.

Jesse Barnes posted a driver for the SGI Altix serial console. The code was copyright SGI, released under the GPL, with Pat Gefre of SGI listed as the maintainer. Christoph Hellwig noticed that in the copyright notice, there was the following comment: "Further, this software is distributed without any warranty that it is free of the rightful claim of any third person regarding infringement or the like. Any license provided herein, whether implied or otherwise, applies only to this software file. Patent licenses, if any, provided herein do not apply to combinations of this program with other software, or any other product whatsoever." Christoph remarked, "This seems to be a restriction not compatible with the GPL. And it looks like many SGI files in the tree have it as well." Elsewhere, Alan Cox said the patch should be rejected until the licensing issues were sorted out. He said, "I am perfectly permitted by the GPL to modify Linux and add other code to it, or to create a new product based on it that is GPL licensed. SGI need to fix their boilerplate. I can kind of guess what they are trying to say but it needs tweaking." But Jesse pointed out, "according to the FSF our extra clauses are compatible with the GPL and LGPL. See http://oss.sgi.com/projects/GenInfo/NoticeExplan/. If you still disagree then we'll have to try to find another solution." Alan took a look and said he'd have to investigate the legal issues further with the FSF lawyer. Jamie Lokier replied:

My reading of the boilerplate is that I can't use their code in another GPL program, because their patent grant doesn't extend to other programs.

Even if that's permitted by the GPL, it doesn't mean you have to accept it into the kernel like that.

If that's just a misunderstanding of the legalese on my part, then IMHO the text needs to be clarified to ensure that everyone knows they may take the code and use it in other GPL projects.

Alan said, "I'll let folks know when I hear back from the FSF folks or if SGI decide to tweak it anyway."

17Sep2003-23Sep2003 (17 posts) Subject: "[PATCH 2.6.x] additional kernel event notifications"

We request that the following patch for additional kernel event notifications be included in the upcoming 2.6.x kernel.

The current profiling hooks provide notifications at the end of a task's lifetime (i.e., task exit, mmap exit, and exec unmap). We would like to have additional notifications on the start of a task (i.e., fork, execve, kernel image loads, and user image loads).

We believe that profiling tools such as Oprofile, Perfmon, and VTune would benefit from the additional hooks by improving the accuracy and completeness of the performance data, especially when working in environments that can dynamically create and destroy executable code (such as Java). Furthermore, these hooks could be used to measure different types of performance data (e.g., "forks per second") which are currently not available any other way.

Our patch follows the conventions used by the current profiling hooks, and is relatively small.

We would appreciate comments/feedback on our proposal.
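
To make the "forks per second" idea concrete, here is a toy user-space model in Python of how a tool might consume such notifications. Everything in it is hypothetical: the event names, the EventNotifier class and the payload format are illustrative stand-ins, not the kernel's profiling interface or the API proposed in the patch.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

# Illustrative labels for the four proposed start-of-life events.
FORK = "fork"
EXECVE = "execve"
KERNEL_IMAGE_LOAD = "kernel_image_load"
USER_IMAGE_LOAD = "user_image_load"

Handler = Callable[[Dict[str, float]], None]


class EventNotifier:
    """Toy notifier: tools register callbacks, the 'kernel' side calls notify()."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def register(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def unregister(self, event: str, handler: Handler) -> None:
        self._handlers[event].remove(handler)

    def notify(self, event: str, info: Dict[str, float]) -> None:
        # Iterate over a copy so a handler may unregister itself safely.
        for handler in list(self._handlers[event]):
            handler(info)


class ForkRateCounter:
    """Derives a forks-per-second figure from fork notifications."""

    def __init__(self) -> None:
        self.timestamps: List[float] = []

    def on_fork(self, info: Dict[str, float]) -> None:
        self.timestamps.append(info["time"])

    def forks_per_second(self) -> float:
        if len(self.timestamps) < 2:
            return 0.0
        span = self.timestamps[-1] - self.timestamps[0]
        return (len(self.timestamps) - 1) / span if span > 0 else 0.0
```

A profiler would register one callback per event it cares about and unregister on shutdown; the point of the sketch is only that a start-of-task hook makes such a rate measurable without patching the syscall table.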

Jesse Barnes supported the patch, saying, "I for one would like to see it so that the performance monitoring tools can work properly without having to resort to syscall table patching." But Andrew Morton said that acceptance into the kernel would depend on licensing issues. He said, "If the code which needs the hooks is not in the kernel.org tree then people can patch the core kernel at the same time as adding the performance analysis patch. If the code which needs these hooks is not appropriately licensed then these hooks basically constitute a GPL bypass and that is not a direction we wish to be heading in." Juan replied:

Our sampling driver kernel module which uses these hooks is GPL and could be included in the kernel.org tree.

The current version of the driver (also GPL, but which hooks the sys_call_table for 2.4.x-based kernels) is posted at,

http://www.intel.com/software/products/opensource/vdk/

We plan to post our new driver for kernel 2.6.0-test5 (with the event notification patch applied) on both IA-32 and IA-64 to the above site early next week.

Jun Nakajima opined that he felt Intel was not trying to bypass the GPL, but was simply implementing functionality that was similar in scope to other code in the kernel.

That code seems to have a lot of infrastructure for buffering samples, transferring it to userspace, etc.

Have you looked into using the infrastructure in drivers/oprofile/ for this? In other words: is it possible to augment the kernel's existing oprofile capabilities so they meet VTune requirements?

Juan explained, "We are not trying to change the current profiling infrastructure. We are trying to enhance the existing event notification scheme to handle more events." He added, "The current event notifications used by tools like Oprofile, while quite useful, are not sufficient. The additional event notifications we propose can provide a more complete picture for performance tuning on Linux, particularly for dynamically generated code (such as found in Java). In addition to allowing for the enhancement of current performance tools, it also enables creation of new tools to gather measurements that were previously difficult to obtain (e.g., "image loads per second")."

Anton Blanchard suggested that Intel's work be layered on top of oprofile, to avoid duplication of effort; but Juan replied that the patch "adds 4 generic hooks to the existing set of profiling hooks. These additional hooks can be used to help performance tools such as Oprofile and VTune to not mis-attribute performance data." He added, "We are open to the possibility of including the VTune driver into the base kernel, perhaps in an architecture-dependent area. It could complement existing profilers."

In the course of discussion, Larry McVoy complained again about Andrea Arcangeli's email sig, which advocated locations to access the kernel source tree without relying on the BitKeeper client. Larry said, "I can assure you that the first time the CVS gateway has a problem it won't come back until you have stopped being rude. You do understand that the SVN and RSYNC data come from the CVS gateway and that the CVS gateway comes from BitMover and that all of this crap is hosted by BitMover, right? {cvs,svn}.kernel.org are cnames for kernel.bkbits.net." Andrea replied, saying he didn't intend to be rude, he just had a different opinion from Larry. Andrea said, "I will never say that you're rude because your claims against open source you posted several times in linux-kernel (you know the parasite that eat the host, and lots and lots of stuff like that, all things that I absolutely and totally disagree with), I will never say the bitkeeper "free" licence is rude or whatever like that despite I find it much less acceptable than all other proprietary licence I dealt with in my limited experience with proprietary software, but people is different, it's not about being rude, it's about thinking differently, and I will never buy from you that thinking different is the same as being rude." Larry objected, "you are saying that closed source is bad, in particular, that BitKeeper is bad. That's not the problem, lots of people think that closed source is bad, but in the same breath you promote some free gateways PAID FOR BY BITKEEPER and requested by you. That's hypocritical in the extreme."

Chris Rivera said, "Robert Love and I would like to announce the release of a new procps utility, slabtop. Slabtop displays detailed kernel slab layer information in real time. The look of slabtop matches top's. Slabtop displays a statistics header along with the 'top' caches based on a sort criteria." He gave a link to Robert Love's procps packages, and said he hoped folks would find the tool useful. A couple posts down the line, Robert added, "This is actually inspired by (although not really based on) Martin Bligh's vmtop perl script. I just checked in a little "thanks" to him into the man page ;-)"

24Sep2003-25Sep2003 (14 posts) Subject: "rfc: test whether a device has a partition table"

As everyone knows it is a bad idea to let the kernel guess whether there is a partition table on a given block device, and if so, of what type. Nevertheless this is what almost everybody does.

Until now the philosophy was: floppies do not have a partition table, disks do have one, and for ZIP drives nobody knows. With USB we get more types of block device that may or may not have a partition table (and if they have none, usually there is a FAT filesystem with bootsector). In such cases the kernel assumes a partition table, and creates a mess if there was none. Some heuristics are needed.

Many checks are possible (for a DOS-type partition table: boot indicator must be 0 or 0x80, partitions are not larger than the disk, non-extended partitions are mutually disjoint; for a boot sector: it starts with a jump, the number of bytes per sector is 512 or at least a power of two, the number of sectors per cluster is 1 or at least a power of two, the number of reserved sectors is 1 or 32, the number of FAT copies is 2, ...).

I tried a minimal test, and the below is good enough for the boot sectors and DOS-type partition tables that I have here.

So, question: are there people with DOS-type partition tables or FAT fs bootsectors where the below gives the wrong answer? I would be interested in a copy of the sector.

I expect to submit some sanity check to DOS-type partition table parsing, and hope to recognize with high probability the presence of a full disk FAT filesystem.
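
The checks listed in the post are easy to picture in code. The Python sketch below is not the patch from the post; it is a user-space illustration that applies those checks to a 512-byte first sector, using the conventional on-disk offsets (partition entries at byte 446, FAT parameters at byte 11). The 0x55AA signature test, the set of extended partition types and the "at least one non-empty entry" rule are our own assumptions.

```python
import struct

EXTENDED_TYPES = {0x05, 0x0F, 0x85}  # assumed set of extended partition type codes


def is_power_of_two(n: int) -> bool:
    return n > 0 and n & (n - 1) == 0


def looks_like_dos_partition_table(sector: bytes, disk_sectors: int) -> bool:
    """Apply the partition-table checks from the post to the first sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        return False
    extents = []
    seen_partition = False
    for i in range(4):
        entry = sector[446 + 16 * i: 446 + 16 * (i + 1)]
        boot_indicator, ptype = entry[0], entry[4]
        start, size = struct.unpack_from("<II", entry, 8)
        if boot_indicator not in (0x00, 0x80):
            return False
        if ptype == 0:
            continue  # unused slot
        seen_partition = True
        if start + size > disk_sectors:
            return False  # partition larger than the disk
        if ptype not in EXTENDED_TYPES:
            extents.append((start, start + size))
    # Non-extended partitions must be mutually disjoint.
    extents.sort()
    disjoint = all(prev_end <= next_start
                   for (_, prev_end), (next_start, _) in zip(extents, extents[1:]))
    return seen_partition and disjoint


def looks_like_fat_boot_sector(sector: bytes) -> bool:
    """Apply the boot-sector checks from the post to the first sector."""
    if len(sector) < 512:
        return False
    bytes_per_sector, sectors_per_cluster, reserved, fat_copies = \
        struct.unpack_from("<HBHB", sector, 11)
    return (sector[0] in (0xEB, 0xE9)           # starts with a jump
            and is_power_of_two(bytes_per_sector)
            and is_power_of_two(sectors_per_cluster)
            and reserved in (1, 32)
            and fat_copies == 2)
```

A FAT boot sector filled in with 512 bytes per sector, 4 sectors per cluster, 1 reserved sector and 2 FATs passes the second function and, having no partition entries, fails the first, which is the distinction the heuristic is after.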

So you say, and so you've said for a long time, but claiming that "everybody knows it" is clearly not true.

In particular, I think that a kernel that doesn't do partitioning is quite fundamentally broken. I'm sure others will agree.

If you have unusual cases (and let's face it, they don't much happen - we have traditionally had _very_ few problems with getting things partitioned) then you should be able to override them from user space and have user space be able to tell the kernel about special partitions.

And hey, surprise surprise, you can do exactly that.

Also, surprise surprise, pretty much nobody actually does it. Because the defaults are so sane.

Repeat after me: make the defaults so sane that most people don't even have to think about it.

In short, I think your first sentence (upon which the rest of the argument depends) is just quite _fundamentally_ flawed.

Andries protested that yes, he had been pointing out the theoretical problems for years; but that now the problems had moved into the practical realm, and needed to be dealt with. He said, "My post implicitly suggested the minimal thing to do. It will not be enough - heuristics are never enough - but it probably helps in most cases." But Alexander Viro protested, "If there *is* a partition table with one entry and it gets misparsed - we have a real bug that has to be dealt with and your heuristics won't help. If there is no partition table at all and in fact they have a filesystem on the entire disk - let them use *entire* *disk*. You can very well read /dev/sd<letter>, mount it, whatever." Andries replied:

First, if the kernel comes up with a bogus partition table, this will confuse users (and user space) greatly. It is not harmless, even though you would know how to survive.

Second, if the kernel reads random stuff from flash media that may yield I/O errors. Such media often do not have blocks at a fixed place, but have at the start a table that says where on the media a given block lives. Blocks that have never been written do not occur in the table, and attempts to read them give an I/O error. (And our famous SCSI error handling may want to retry a few times, reset the device and retry, reset the bus and retry .. I have seen boot times of a quarter of an hour because the kernel was busy retrying SmartMedia accesses.) In short - we should not read random blocks from a disk on flash media.

Elsewhere, Linus also thought the problem amounted to just a bug that needed fixing, rather than Andries' radical proposal. He added:

The _worst_ thing that can happen is that you have four extra (totally bogus) partitions, and you end up using the whole device.

That's my point about partitioning - not that it's necessarily perfect, but even when it _isn't_ perfect, it's no worse than not partitioning at all.

Letting mount or the kernel guess the type of the filesystem to mount is bad. If the kernel or mount guesses wrong the result can be fs corruption and kernel crash. So the right approach is to always give a -t option to mount and a rootfstype= boot option to the kernel.

But most people don't, and survive. And I maintain mount and over time a system of heuristics has been built into mount to make it rather likely that a guess will be correct.

The partition situation is similar but a bit worse. We have the second half: likely guesses, but we lack the first half: correctness with certainty.

What probably will happen as a result of this episode is that the likelihood of certain guesses is improved a bit. But I wouldn't mind the option of having certainty instead of probability. Userspace that tells the kernel, instead of letting the kernel probe.

Elsewhere, Linus took another look at Andries' patch, and made specific objections to various technical points. Andries responded to these, and the discussion then probably went to private email.

25Sep2003 (1 post) Subject: "[ANNOUNCE] DigSig 0.2: kernel module for digital signature verification for binaries"

DSI development team would like to announce the release 0.2 of digsig.

This kernel module (for 2.5.66 and higher) checks the signature of the binary before running it. The main goal is to insert digital signatures inside the ELF binary and verify this signature before loading the binary. It is based on the Linux Security Module hooks.

The code is GPL and available from: http://sourceforge.net/projects/disec/, download digsig-0.2.

I hope that it'll be useful to you. All bug reports and feature requests or general feedback are welcome (please CC me in your answer or feedback to the mailing list).

Overview

Instead of writing a long detailed explanation, I'd rather give you an example of how you can use it.

A very simple scenario to show how to use it

1) Generate gpg key and export your public key in order to use it for signature verification.

$gpg --gen-key

=> careful: generate an RSA key

$gpg --export >> my_public_key.pub

2) Sign your binaries using Bsign

Before using bsign to sign all your binaries, try out with a simple example.

$cp `which ps` ps-test
$bsign -s ps-test / sign the binary
$bsign -v ps-test / be sure that the signature is valid

3) Make the digsig module

From ./digsig, do make -C /usr/src/linux-2.5.66 SUBDIRS=$PWD modules. You need rw access to /usr/src/linux-2.5.66.

CAREFUL: we advise you to compile the module in debug mode for your first tries (see -DDSI_DEBUG -DDSI_DIGSIG_DEBUG in the Makefile). In this mode, the module verifies the signatures but does not enforce the security (if no signature is present in your binary, you'll have a message in /var/log/messages but the execution is not aborted). However, the execution of binaries with invalid signatures is aborted. Once you're sure of your binary signature procedure, you can recompile the whole thing in non-debug mode.

4) load digsig, use the public key exported in step 1 as argument

[root@colby digsig-dev]# ./digsig.init start pubkey.pub
Loading Digsig module.
Making device for communication with the module.
Loading public key.
Done.
[root@colby digsig-dev]#

5) In debug mode:

$./ps-test

$tail -f /var/log/messages
Sep 16 15:49:16 colby kernel: DSI-LSM MODULE - binary is ./ps-test
Sep 16 15:49:16 colby kernel: DSI-LSM MODULE - dsi_bprm_compute_creds:
Found signature section
Sep 16 15:49:16 colby kernel: DSI-LSM MODULE - dsi_bprm_compute_creds:
Signature verification successful

$ps

/ no check for not signed binaries
$tail -f /var/log/messages
Sep 16 15:49:16 colby kernel: DSI-LSM MODULE - binary is ./ps

6) In restrictive mode, normal mode

You need to use bsign to sign all binaries that you want to run in normal mode.

/ signed binary
[lmcmpou@reblochon lmcmpou]$ ps
 PID TTY          TIME CMD
6897 pts/2    00:00:00 bash
6941 pts/2    00:00:00 ps

/ not signed binary
[lmcmpou@reblochon lmcmpou]$ ./ps-makan-1
bash: ./ps-makan: cannot execute binary file

/ binary with wrong signature
[lmcmpou@reblochon lmcmpou]$ ./ps-makan-2
bash: ./ps-makan-colby: Operation not permitted

7) Unload the module.

[root@colby digsig-dev]# ./digsig.init stop
Unloading Digsig.
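
The difference between the debug and normal modes in the walkthrough boils down to a small decision table. The Python sketch below models it; it is not DigSig code, and the names are hypothetical. The errno values are inferred from the shell messages shown in step 6 ("cannot execute binary file" is what bash prints for ENOEXEC, "Operation not permitted" is the text for EPERM), so treat them as an assumption about what the module returns.

```python
import errno
from enum import Enum


class Signature(Enum):
    VALID = "valid"
    MISSING = "missing"
    INVALID = "invalid"


def exec_decision(signature: Signature, debug_mode: bool) -> int:
    """Return 0 to let the exec proceed, or the errno used to refuse it."""
    if signature is Signature.VALID:
        return 0
    if signature is Signature.INVALID:
        # Binaries with a wrong signature are refused in both modes.
        return errno.EPERM
    # Unsigned binary: only logged in debug mode, refused in normal mode.
    return 0 if debug_mode else errno.ENOEXEC
```

This is why the announcement recommends starting in debug mode: an unsigned system binary is merely reported there, while in normal mode it would stop running until it has been signed with bsign.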

Performances

This is release 0.2. We have done some benchmarks.

We ran lmbench on a Pentium IV, 2.4 GHz, 500 megabytes of memory, running Linux 2.5.66. Our benchmarks show that the execution time (exec function call) multiplies by a factor of 4 when the module is loaded (no changes for fork call, as the binary is not loaded into memory).

Some details

The module is independent from DSI (parent project) and you don't need to download the whole dsi tar ball to play with the digsig module (even if we'll be more than happy to have your feedback about dsi project :-)).

Our approach has been to use the existing solutions like gpg and bsign rather than reinventing the whole thing from scratch.

However, in order to reduce the overhead in the kernel, we took only the minimum code necessary from GPG. We took only the MPI (Multi Precision Integer) source code and the RSA crypto source code. This helped much to reduce the amount of code imported into the kernel (only 1/10 of the original gnupg 1.2.2 source code has been imported into the kernel module). On the other hand, we avoided OpenSSL source code because the licensing was not clear to us. We did some tests at user level and found out that OpenSSL is 4 times faster than GPG regarding RSA verification. As a future direction, we plan to clarify this licensing issue and use OpenSSL instead of GPG.

Requirements:

Linux OS kernel 2.5.66 or higher. We tested against 2.5.66 and 2.6.0-test5.

Bsign version 0.4.5. (http://packages.debian.org/unstable/admin/bsign.html)

GPG 1.2.2 or higher.

Merits

This work has been done by (alphabetical order)

A Apvrille,
D Gordon,
M Pourzandi,
V Roy.

Special merits go to David who wrote big chunks of the source code.

Thanks to Radu Filip, who did the initial study for this work.

Thanks also to Marc Singer who helped us in using Bsign.



1.	9Sep2003-18Sep2003	(48 posts)	Status Of Large Memory Support
2.	9Sep2003-13Sep2003	(75 posts)	BitMover Asks Kernel Developers To Stop Complaining About BitKeeper
3.	10Sep2003-18Sep2003	(127 posts)	Athlon Prefetch Errata And Fix
4.	17Sep2003-18Sep2003	(10 posts)	Minor Tweaks To netpoll And netconsole
5.	17Sep2003-20Sep2003	(11 posts)	New SGI Altix Serial Console Driver; GPL Concerns With SGI-Contributed Code
6.	17Sep2003-23Sep2003	(17 posts)	Kernel Event Notification Code From Intel
7.	19Sep2003-23Sep2003	(30 posts)	More Threats From BitMover
8.	21Sep2003-22Sep2003	(11 posts)	New slabtop Utility To Track The Slab Layer Information In Real Time
9.	24Sep2003-25Sep2003	(14 posts)	Dealing With Partition Table Problems
10.	25Sep2003	(1 post)	New DigSig Module For Digital Signature Verification For Binaries

By Zack Brown