In the Linux kernel, the following vulnerability has been resolved:
tracing: Ensure visibility when inserting an element into tracing_map
Running the following two commands in parallel on a multi-processor AArch64 machine can sporadically produce an unexpected warning about duplicate histogram entries:
$ while true; do echo hist:key=id.syscall:val=hitcount > \ /sys/kernel/debug/tracing/events/rawsyscalls/sysenter/trigger cat /sys/kernel/debug/tracing/events/rawsyscalls/sysenter/hist sleep 0.001 done $ stress-ng --sysbadaddr $(nproc)
The warning looks as follows:
[ 2911.172474] ------------[ cut here ]------------ [ 2911.173111] Duplicates detected: 1 [ 2911.173574] WARNING: CPU: 2 PID: 12247 at kernel/trace/tracingmap.c:983 tracingmapsortentries+0x3e0/0x408 [ 2911.174702] Modules linked in: iscsiibft(E) iscsibootsysfs(E) rfkill(E) afpacket(E) nlsiso88591(E) nlscp437(E) vfat(E) fat(E) ena(E) tinypowerbutton(E) qemufwcfg(E) button(E) fuse(E) efipstore(E) iptables(E) xtables(E) xfs(E) libcrc32c(E) aesceblk(E) aescecipher(E) crct10difce(E) polyvalce(E) polyvalgeneric(E) ghashce(E) gf128mul(E) sm4cegcm(E) sm4ceccm(E) sm4ce(E) sm4cecipher(E) sm4(E) sm3ce(E) sm3(E) sha3ce(E) sha512ce(E) sha512arm64(E) sha2ce(E) sha256arm64(E) nvme(E) sha1ce(E) nvmecore(E) nvmeauth(E) t10pi(E) sg(E) scsimod(E) scsicommon(E) efivarfs(E) [ 2911.174738] Unloaded tainted modules: cppccpufreq(E):1 [ 2911.180985] CPU: 2 PID: 12247 Comm: cat Kdump: loaded Tainted: G E 6.7.0-default #2 1b58bbb22c97e4399dc09f92d309344f69c44a01 [ 2911.182398] Hardware name: Amazon EC2 c7g.8xlarge/, BIOS 1.0 11/1/2018 [ 2911.183208] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 2911.184038] pc : tracingmapsortentries+0x3e0/0x408 [ 2911.184667] lr : tracingmapsortentries+0x3e0/0x408 [ 2911.185310] sp : ffff8000a1513900 [ 2911.185750] x29: ffff8000a1513900 x28: ffff0003f272fe80 x27: 0000000000000001 [ 2911.186600] x26: ffff0003f272fe80 x25: 0000000000000030 x24: 0000000000000008 [ 2911.187458] x23: ffff0003c5788000 x22: ffff0003c16710c8 x21: ffff80008017f180 [ 2911.188310] x20: ffff80008017f000 x19: ffff80008017f180 x18: ffffffffffffffff [ 2911.189160] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000a15134b8 [ 2911.190015] x14: 0000000000000000 x13: 205d373432323154 x12: 5b5d313131333731 [ 2911.190844] x11: 00000000fffeffff x10: 00000000fffeffff x9 : ffffd1b78274a13c [ 2911.191716] x8 : 000000000017ffe8 x7 : c0000000fffeffff x6 : 000000000057ffa8 [ 2911.192554] x5 : ffff0012f6c24ec0 x4 : 0000000000000000 x3 : ffff2e5b72b5d000 [ 2911.193404] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0003ff254480 [ 2911.194259] Call trace: [ 2911.194626] tracingmapsortentries+0x3e0/0x408 [ 2911.195220] histshow+0x124/0x800 [ 2911.195692] seqreaditer+0x1d4/0x4e8 [ 2911.196193] seqread+0xe8/0x138 [ 2911.196638] vfsread+0xc8/0x300 [ 2911.197078] ksysread+0x70/0x108 [ 2911.197534] _arm64sysread+0x24/0x38 [ 2911.198046] invokesyscall+0x78/0x108 [ 2911.198553] el0svccommon.constprop.0+0xd0/0xf8 [ 2911.199157] doel0svc+0x28/0x40 [ 2911.199613] el0svc+0x40/0x178 [ 2911.200048] el0t64synchandler+0x13c/0x158 [ 2911.200621] el0t64_sync+0x1a8/0x1b0 [ 2911.201115] ---[ end trace 0000000000000000 ]---
The problem appears to be caused by CPU reordering of writes issued from _tracingmap_insert().
The check for the presence of an element with a given key in this function is:
val = READONCE(entry->val); if (val && keysmatch(key, val->key, map->key_size)) ...
The write of a new entry is:
elt = getfreeelt(map); memcpy(elt->key, key, map->key_size); entry->val = elt;
The "memcpy(elt->key, key, map->key_size);" and "entry->val = elt;" stores may become visible in the reversed order on another CPU. This second CPU might then incorrectly determine that a new key doesn't match an already present val->key and subse ---truncated---