Re: Canonical Livepatch broke CephFS client

On Wed, Aug 14, 2019 at 1:54 PM Tim Bishop <tim-lists@xxxxxxxxxxx> wrote:
>
> On Wed, Aug 14, 2019 at 12:44:15PM +0200, Ilya Dryomov wrote:
> > On Tue, Aug 13, 2019 at 10:56 PM Tim Bishop <tim-lists@xxxxxxxxxxx> wrote:
> > > This email is mostly a heads up for others who might be using
> > > Canonical's livepatch on Ubuntu on a CephFS client.
> > >
> > > I have an Ubuntu 18.04 client with the standard kernel currently at
> > > version linux-image-4.15.0-54-generic 4.15.0-54.58. CephFS is mounted
> > > with the kernel client. Cluster is running mimic 13.2.6. I've got
> > > livepatch running and this evening it did an update:
> > >
> > > Aug 13 17:33:55 myclient canonical-livepatch[2396]: Client.Check
> > > Aug 13 17:33:55 myclient canonical-livepatch[2396]: Checking with livepatch service.
> > > Aug 13 17:33:55 myclient canonical-livepatch[2396]: updating last-check
> > > Aug 13 17:33:55 myclient canonical-livepatch[2396]: touched last check
> > > Aug 13 17:33:56 myclient canonical-livepatch[2396]: Applying update 54.1 for 4.15.0-54.58-generic
> > > Aug 13 17:33:56 myclient kernel: [3700923.970750] PKCS#7 signature not signed with a trusted key
> > > Aug 13 17:33:59 myclient kernel: [3700927.069945] livepatch: enabling patch 'lkp_Ubuntu_4_15_0_54_58_generic_54'
> > > Aug 13 17:33:59 myclient kernel: [3700927.154956] livepatch: 'lkp_Ubuntu_4_15_0_54_58_generic_54': starting patching transition
> > > Aug 13 17:34:01 myclient kernel: [3700928.994487] livepatch: 'lkp_Ubuntu_4_15_0_54_58_generic_54': patching complete
> > > Aug 13 17:34:09 myclient canonical-livepatch[2396]: Applied patch version 54.1 to 4.15.0-54.58-generic
> > >
> > > And then immediately I saw:
> > >
> > > Aug 13 17:34:18 myclient kernel: [3700945.728684] libceph: mds0 1.2.3.4:6800 socket closed (con state OPEN)
> > > Aug 13 17:34:18 myclient kernel: [3700946.040138] libceph: mds0 1.2.3.4:6800 socket closed (con state OPEN)
> > > Aug 13 17:34:19 myclient kernel: [3700947.105692] libceph: mds0 1.2.3.4:6800 socket closed (con state OPEN)
> > > Aug 13 17:34:20 myclient kernel: [3700948.033704] libceph: mds0 1.2.3.4:6800 socket closed (con state OPEN)
> > >
> > > And on the MDS:
> > >
> > > 2019-08-13 17:34:18.286 7ff165e75700  0 SIGN: MSG 9241367 Message signature does not match contents.
> > > 2019-08-13 17:34:18.286 7ff165e75700  0 SIGN: MSG 9241367Signature on message:
> > > 2019-08-13 17:34:18.286 7ff165e75700  0 SIGN: MSG 9241367    sig: 10517606059379971075
> > > 2019-08-13 17:34:18.286 7ff165e75700  0 SIGN: MSG 9241367Locally calculated signature:
> > > 2019-08-13 17:34:18.286 7ff165e75700  0 SIGN: MSG 9241367 sig_check:4899837294009305543
> > > 2019-08-13 17:34:18.286 7ff165e75700  0 Signature failed.
> > > 2019-08-13 17:34:18.286 7ff165e75700  0 -- 1.2.3.4:6800/512468759 >> 4.3.2.1:0/928333509 conn(0xe6b9500 :6800 >> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=2 cs=1 l=0).process >> Signature check failed
> > >
> > > Thankfully I was able to umount -f to unfreeze the client, but I have
> > > been unable to remount the file system using the kernel client.
> > > The fuse client worked fine as a workaround, but is slower.
> > >
> > > Taking a look at livepatch 54.1 I can see it touches Ceph code in the
> > > kernel:
> > >
> > > https://git.launchpad.net/~ubuntu-livepatch/+git/bionic-livepatches/commit/?id=3a3081c1e4c8e2e0f9f7a1ae4204eba5f38fbd29
> > >
> > > But the relevance of those changes isn't immediately clear to me. I
> > > expect it'll be fine after a reboot, but that's as yet untested.
> >
> > These changes are very relevant.  They introduce support for CEPHX_V2
> > protocol, where message signatures are computed slightly differently:
> > same algorithm but a different set of inputs.  The live-patched kernel
> > likely started signing using CEPHX_V2 without renegotiating.
>
> Ah - thanks for looking. Looks like something that wasn't a security
> issue, so it shouldn't have been included in the live patch.

Well, strictly speaking it is a security issue because the protocol was
rev'ed in response to two CVEs:

  https://nvd.nist.gov/vuln/detail/CVE-2018-1128
  https://nvd.nist.gov/vuln/detail/CVE-2018-1129

That said, it definitely doesn't qualify for live-patching, especially
when the resulting kernel image is not thoroughly tested.

>
> > This is a good example of how live-patching can go wrong.  A reboot
> > should definitely help.
>
> Yup, it certainly has its tradeoffs (not having to reboot so regularly
> is a definite positive, though). I've replicated on a test machine and
> confirmed that a reboot does indeed fix the problem.

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


