pandora-kernel.git
7 years agosch_tbf: fix two null pointer dereferences on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:49:05 +0000 (12:49 +0300)]
sch_tbf: fix two null pointer dereferences on init failure

commit c2d6511e6a4f1f3673d711569c00c3849549e9b0 upstream.

sch_tbf calls qdisc_watchdog_cancel() in both its ->reset and ->destroy
callbacks but it may fail before the timer is initialized due to missing
options (either not supplied by user-space or set as a default qdisc),
also q->qdisc is used by ->reset and ->destroy so we need it initialized.

Reproduce:
$ sysctl net.core.default_qdisc=tbf
$ ip l set ethX up

Crash log:
[  959.160172] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[  959.160323] IP: qdisc_reset+0xa/0x5c
[  959.160400] PGD 59cdb067
[  959.160401] P4D 59cdb067
[  959.160466] PUD 59ccb067
[  959.160532] PMD 0
[  959.160597]
[  959.160706] Oops: 0000 [#1] SMP
[  959.160778] Modules linked in: sch_tbf sch_sfb sch_prio sch_netem
[  959.160891] CPU: 2 PID: 1562 Comm: ip Not tainted 4.13.0-rc6+ #62
[  959.160998] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[  959.161157] task: ffff880059c9a700 task.stack: ffff8800376d0000
[  959.161263] RIP: 0010:qdisc_reset+0xa/0x5c
[  959.161347] RSP: 0018:ffff8800376d3610 EFLAGS: 00010286
[  959.161531] RAX: ffffffffa001b1dd RBX: ffff8800373a2800 RCX: 0000000000000000
[  959.161733] RDX: ffffffff8215f160 RSI: ffffffff8215f160 RDI: 0000000000000000
[  959.161939] RBP: ffff8800376d3618 R08: 00000000014080c0 R09: 00000000ffffffff
[  959.162141] R10: ffff8800376d3578 R11: 0000000000000020 R12: ffffffffa001d2c0
[  959.162343] R13: ffff880037538000 R14: 00000000ffffffff R15: 0000000000000001
[  959.162546] FS:  00007fcc5126b740(0000) GS:ffff88005d900000(0000) knlGS:0000000000000000
[  959.162844] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  959.163030] CR2: 0000000000000018 CR3: 000000005abc4000 CR4: 00000000000406e0
[  959.163233] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  959.163436] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  959.163638] Call Trace:
[  959.163788]  tbf_reset+0x19/0x64 [sch_tbf]
[  959.163957]  qdisc_destroy+0x8b/0xe5
[  959.164119]  qdisc_create_dflt+0x86/0x94
[  959.164284]  ? dev_activate+0x129/0x129
[  959.164449]  attach_one_default_qdisc+0x36/0x63
[  959.164623]  netdev_for_each_tx_queue+0x3d/0x48
[  959.164795]  dev_activate+0x4b/0x129
[  959.164957]  __dev_open+0xe7/0x104
[  959.165118]  __dev_change_flags+0xc6/0x15c
[  959.165287]  dev_change_flags+0x25/0x59
[  959.165451]  do_setlink+0x30c/0xb3f
[  959.165613]  ? check_chain_key+0xb0/0xfd
[  959.165782]  rtnl_newlink+0x3a4/0x729
[  959.165947]  ? rtnl_newlink+0x117/0x729
[  959.166121]  ? ns_capable_common+0xd/0xb1
[  959.166288]  ? ns_capable+0x13/0x15
[  959.166450]  rtnetlink_rcv_msg+0x188/0x197
[  959.166617]  ? rcu_read_unlock+0x3e/0x5f
[  959.166783]  ? rtnl_newlink+0x729/0x729
[  959.166948]  netlink_rcv_skb+0x6c/0xce
[  959.167113]  rtnetlink_rcv+0x23/0x2a
[  959.167273]  netlink_unicast+0x103/0x181
[  959.167439]  netlink_sendmsg+0x326/0x337
[  959.167607]  sock_sendmsg_nosec+0x14/0x3f
[  959.167772]  sock_sendmsg+0x29/0x2e
[  959.167932]  ___sys_sendmsg+0x209/0x28b
[  959.168098]  ? do_raw_spin_unlock+0xcd/0xf8
[  959.168267]  ? _raw_spin_unlock+0x27/0x31
[  959.168432]  ? __handle_mm_fault+0x651/0xdb1
[  959.168602]  ? check_chain_key+0xb0/0xfd
[  959.168773]  __sys_sendmsg+0x45/0x63
[  959.168934]  ? __sys_sendmsg+0x45/0x63
[  959.169100]  SyS_sendmsg+0x19/0x1b
[  959.169260]  entry_SYSCALL_64_fastpath+0x23/0xc2
[  959.169432] RIP: 0033:0x7fcc5097e690
[  959.169592] RSP: 002b:00007ffd0d5c7b48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  959.169887] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007fcc5097e690
[  959.170089] RDX: 0000000000000000 RSI: 00007ffd0d5c7b90 RDI: 0000000000000003
[  959.170292] RBP: ffff8800376d3f98 R08: 0000000000000001 R09: 0000000000000003
[  959.170494] R10: 00007ffd0d5c7910 R11: 0000000000000246 R12: 0000000000000006
[  959.170697] R13: 000000000066f1a0 R14: 00007ffd0d5cfc40 R15: 0000000000000000
[  959.170900]  ? trace_hardirqs_off_caller+0xa7/0xcf
[  959.171076] Code: 00 41 c7 84 24 14 01 00 00 00 00 00 00 41 c7 84 24
98 00 00 00 00 00 00 00 41 5c 41 5d 41 5e 5d c3 66 66 66 66 90 55 48 89
e5 53 <48> 8b 47 18 48 89 fb 48 8b 40 48 48 85 c0 74 02 ff d0 48 8b bb
[  959.171637] RIP: qdisc_reset+0xa/0x5c RSP: ffff8800376d3610
[  959.171821] CR2: 0000000000000018

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agosch_netem: avoid null pointer deref on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:49:03 +0000 (12:49 +0300)]
sch_netem: avoid null pointer deref on init failure

commit 634576a1844dba15bc5e6fc61d72f37e13a21615 upstream.

netem can fail in ->init due to missing options (either not supplied by
user-space or used as a default qdisc) causing a timer->base null
pointer deref in its ->destroy() and ->reset() callbacks.

Reproduce:
$ sysctl net.core.default_qdisc=netem
$ ip l set ethX up

Crash log:
[ 1814.846943] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1814.847181] IP: hrtimer_active+0x17/0x8a
[ 1814.847270] PGD 59c34067
[ 1814.847271] P4D 59c34067
[ 1814.847337] PUD 37374067
[ 1814.847403] PMD 0
[ 1814.847468]
[ 1814.847582] Oops: 0000 [#1] SMP
[ 1814.847655] Modules linked in: sch_netem(O) sch_fq_codel(O)
[ 1814.847761] CPU: 3 PID: 1573 Comm: ip Tainted: G           O 4.13.0-rc6+ #62
[ 1814.847884] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 1814.848043] task: ffff88003723a700 task.stack: ffff88005adc8000
[ 1814.848235] RIP: 0010:hrtimer_active+0x17/0x8a
[ 1814.848407] RSP: 0018:ffff88005adcb590 EFLAGS: 00010246
[ 1814.848590] RAX: 0000000000000000 RBX: ffff880058e359d8 RCX: 0000000000000000
[ 1814.848793] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880058e359d8
[ 1814.848998] RBP: ffff88005adcb5b0 R08: 00000000014080c0 R09: 00000000ffffffff
[ 1814.849204] R10: ffff88005adcb660 R11: 0000000000000020 R12: 0000000000000000
[ 1814.849410] R13: ffff880058e359d8 R14: 00000000ffffffff R15: 0000000000000001
[ 1814.849616] FS:  00007f733bbca740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
[ 1814.849919] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1814.850107] CR2: 0000000000000000 CR3: 0000000059f0d000 CR4: 00000000000406e0
[ 1814.850313] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1814.850518] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1814.850723] Call Trace:
[ 1814.850875]  hrtimer_try_to_cancel+0x1a/0x93
[ 1814.851047]  hrtimer_cancel+0x15/0x20
[ 1814.851211]  qdisc_watchdog_cancel+0x12/0x14
[ 1814.851383]  netem_reset+0xe6/0xed [sch_netem]
[ 1814.851561]  qdisc_destroy+0x8b/0xe5
[ 1814.851723]  qdisc_create_dflt+0x86/0x94
[ 1814.851890]  ? dev_activate+0x129/0x129
[ 1814.852057]  attach_one_default_qdisc+0x36/0x63
[ 1814.852232]  netdev_for_each_tx_queue+0x3d/0x48
[ 1814.852406]  dev_activate+0x4b/0x129
[ 1814.852569]  __dev_open+0xe7/0x104
[ 1814.852730]  __dev_change_flags+0xc6/0x15c
[ 1814.852899]  dev_change_flags+0x25/0x59
[ 1814.853064]  do_setlink+0x30c/0xb3f
[ 1814.853228]  ? check_chain_key+0xb0/0xfd
[ 1814.853396]  ? check_chain_key+0xb0/0xfd
[ 1814.853565]  rtnl_newlink+0x3a4/0x729
[ 1814.853728]  ? rtnl_newlink+0x117/0x729
[ 1814.853905]  ? ns_capable_common+0xd/0xb1
[ 1814.854072]  ? ns_capable+0x13/0x15
[ 1814.854234]  rtnetlink_rcv_msg+0x188/0x197
[ 1814.854404]  ? rcu_read_unlock+0x3e/0x5f
[ 1814.854572]  ? rtnl_newlink+0x729/0x729
[ 1814.854737]  netlink_rcv_skb+0x6c/0xce
[ 1814.854902]  rtnetlink_rcv+0x23/0x2a
[ 1814.855064]  netlink_unicast+0x103/0x181
[ 1814.855230]  netlink_sendmsg+0x326/0x337
[ 1814.855398]  sock_sendmsg_nosec+0x14/0x3f
[ 1814.855584]  sock_sendmsg+0x29/0x2e
[ 1814.855747]  ___sys_sendmsg+0x209/0x28b
[ 1814.855912]  ? do_raw_spin_unlock+0xcd/0xf8
[ 1814.856082]  ? _raw_spin_unlock+0x27/0x31
[ 1814.856251]  ? __handle_mm_fault+0x651/0xdb1
[ 1814.856421]  ? check_chain_key+0xb0/0xfd
[ 1814.856592]  __sys_sendmsg+0x45/0x63
[ 1814.856755]  ? __sys_sendmsg+0x45/0x63
[ 1814.856923]  SyS_sendmsg+0x19/0x1b
[ 1814.857083]  entry_SYSCALL_64_fastpath+0x23/0xc2
[ 1814.857256] RIP: 0033:0x7f733b2dd690
[ 1814.857419] RSP: 002b:00007ffe1d3387d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 1814.858238] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f733b2dd690
[ 1814.858445] RDX: 0000000000000000 RSI: 00007ffe1d338820 RDI: 0000000000000003
[ 1814.858651] RBP: ffff88005adcbf98 R08: 0000000000000001 R09: 0000000000000003
[ 1814.858856] R10: 00007ffe1d3385a0 R11: 0000000000000246 R12: 0000000000000002
[ 1814.859060] R13: 000000000066f1a0 R14: 00007ffe1d3408d0 R15: 0000000000000000
[ 1814.859267]  ? trace_hardirqs_off_caller+0xa7/0xcf
[ 1814.859446] Code: 10 55 48 89 c7 48 89 e5 e8 45 a1 fb ff 31 c0 5d c3
31 c0 c3 66 66 66 66 90 55 48 89 e5 41 56 41 55 41 54 53 49 89 fd 49 8b
45 30 <4c> 8b 20 41 8b 5c 24 38 31 c9 31 d2 48 c7 c7 50 8e 1d 82 41 89
[ 1814.860022] RIP: hrtimer_active+0x17/0x8a RSP: ffff88005adcb590
[ 1814.860214] CR2: 0000000000000000

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agosch_cbq: fix null pointer dereferences on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:49:01 +0000 (12:49 +0300)]
sch_cbq: fix null pointer dereferences on init failure

commit 3501d059921246ff617b43e86250a719c140bd97 upstream.

CBQ can fail on ->init by wrong nl attributes or simply for missing any,
f.e. if it's set as a default qdisc then TCA_OPTIONS (opt) will be NULL
when it is activated. The first thing init does is parse opt but it will
dereference a null pointer if used as a default qdisc, also since init
failure at default qdisc invokes ->reset() which cancels all timers then
we'll also dereference two more null pointers (timer->base) as they were
never initialized.

To reproduce:
$ sysctl net.core.default_qdisc=cbq
$ ip l set ethX up

Crash log of the first null ptr deref:
[44727.907454] BUG: unable to handle kernel NULL pointer dereference at (null)
[44727.907600] IP: cbq_init+0x27/0x205
[44727.907676] PGD 59ff4067
[44727.907677] P4D 59ff4067
[44727.907742] PUD 59c70067
[44727.907807] PMD 0
[44727.907873]
[44727.907982] Oops: 0000 [#1] SMP
[44727.908054] Modules linked in:
[44727.908126] CPU: 1 PID: 21312 Comm: ip Not tainted 4.13.0-rc6+ #60
[44727.908235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[44727.908477] task: ffff88005ad42700 task.stack: ffff880037214000
[44727.908672] RIP: 0010:cbq_init+0x27/0x205
[44727.908838] RSP: 0018:ffff8800372175f0 EFLAGS: 00010286
[44727.909018] RAX: ffffffff816c3852 RBX: ffff880058c53800 RCX: 0000000000000000
[44727.909222] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffff8800372175f8
[44727.909427] RBP: ffff880037217650 R08: ffffffff81b0f380 R09: 0000000000000000
[44727.909631] R10: ffff880037217660 R11: 0000000000000020 R12: ffffffff822a44c0
[44727.909835] R13: ffff880058b92000 R14: 00000000ffffffff R15: 0000000000000001
[44727.910040] FS:  00007ff8bc583740(0000) GS:ffff88005d880000(0000) knlGS:0000000000000000
[44727.910339] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[44727.910525] CR2: 0000000000000000 CR3: 00000000371e5000 CR4: 00000000000406e0
[44727.910731] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[44727.910936] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[44727.911141] Call Trace:
[44727.911291]  ? lockdep_init_map+0xb6/0x1ba
[44727.911461]  ? qdisc_alloc+0x14e/0x187
[44727.911626]  qdisc_create_dflt+0x7a/0x94
[44727.911794]  ? dev_activate+0x129/0x129
[44727.911959]  attach_one_default_qdisc+0x36/0x63
[44727.912132]  netdev_for_each_tx_queue+0x3d/0x48
[44727.912305]  dev_activate+0x4b/0x129
[44727.912468]  __dev_open+0xe7/0x104
[44727.912631]  __dev_change_flags+0xc6/0x15c
[44727.912799]  dev_change_flags+0x25/0x59
[44727.912966]  do_setlink+0x30c/0xb3f
[44727.913129]  ? check_chain_key+0xb0/0xfd
[44727.913294]  ? check_chain_key+0xb0/0xfd
[44727.913463]  rtnl_newlink+0x3a4/0x729
[44727.913626]  ? rtnl_newlink+0x117/0x729
[44727.913801]  ? ns_capable_common+0xd/0xb1
[44727.913968]  ? ns_capable+0x13/0x15
[44727.914131]  rtnetlink_rcv_msg+0x188/0x197
[44727.914300]  ? rcu_read_unlock+0x3e/0x5f
[44727.914465]  ? rtnl_newlink+0x729/0x729
[44727.914630]  netlink_rcv_skb+0x6c/0xce
[44727.914796]  rtnetlink_rcv+0x23/0x2a
[44727.914956]  netlink_unicast+0x103/0x181
[44727.915122]  netlink_sendmsg+0x326/0x337
[44727.915291]  sock_sendmsg_nosec+0x14/0x3f
[44727.915459]  sock_sendmsg+0x29/0x2e
[44727.915619]  ___sys_sendmsg+0x209/0x28b
[44727.915784]  ? do_raw_spin_unlock+0xcd/0xf8
[44727.915954]  ? _raw_spin_unlock+0x27/0x31
[44727.916121]  ? __handle_mm_fault+0x651/0xdb1
[44727.916290]  ? check_chain_key+0xb0/0xfd
[44727.916461]  __sys_sendmsg+0x45/0x63
[44727.916626]  ? __sys_sendmsg+0x45/0x63
[44727.916792]  SyS_sendmsg+0x19/0x1b
[44727.916950]  entry_SYSCALL_64_fastpath+0x23/0xc2
[44727.917125] RIP: 0033:0x7ff8bbc96690
[44727.917286] RSP: 002b:00007ffc360991e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[44727.917579] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007ff8bbc96690
[44727.917783] RDX: 0000000000000000 RSI: 00007ffc36099230 RDI: 0000000000000003
[44727.917987] RBP: ffff880037217f98 R08: 0000000000000001 R09: 0000000000000003
[44727.918190] R10: 00007ffc36098fb0 R11: 0000000000000246 R12: 0000000000000006
[44727.918393] R13: 000000000066f1a0 R14: 00007ffc360a12e0 R15: 0000000000000000
[44727.918597]  ? trace_hardirqs_off_caller+0xa7/0xcf
[44727.918774] Code: 41 5f 5d c3 66 66 66 66 90 55 48 8d 56 04 45 31 c9
49 c7 c0 80 f3 b0 81 48 89 e5 41 55 41 54 53 48 89 fb 48 8d 7d a8 48 83
ec 48 <0f> b7 0e be 07 00 00 00 83 e9 04 e8 e6 f7 d8 ff 85 c0 0f 88 bb
[44727.919332] RIP: cbq_init+0x27/0x205 RSP: ffff8800372175f0
[44727.919516] CR2: 0000000000000000

Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
 - Keep using HRTIMER_MODE_ABS
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agosch_hfsc: fix null pointer deref and double free on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:49:00 +0000 (12:49 +0300)]
sch_hfsc: fix null pointer deref and double free on init failure

commit 3bdac362a2f89ed3e148fa6f38c5f5d858f50b1a upstream.

Depending on where ->init fails we can get a null pointer deref due to
uninitialized hires timer (watchdog) or a double free of the qdisc hash
because it is already freed by ->destroy().

Fixes: 8d5537387505 ("net/sched/hfsc: allocate tcf block for hfsc root class")
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: sch_hfsc doesn't use a tcf block]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agosch_multiq: fix double free on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:48:58 +0000 (12:48 +0300)]
sch_multiq: fix double free on init failure

commit e89d469e3be3ed3d7124a803211a463ff83d0964 upstream.

The below commit added a call to ->destroy() on init failure, but multiq
still frees ->queues on error in init, but ->queues is also freed by
->destroy() thus we get double free and corrupted memory.

Very easy to reproduce (eth0 not multiqueue):
$ tc qdisc add dev eth0 root multiq
RTNETLINK answers: Operation not supported
$ ip l add dumdum type dummy
(crash)

Trace log:
[ 3929.467747] general protection fault: 0000 [#1] SMP
[ 3929.468083] Modules linked in:
[ 3929.468302] CPU: 3 PID: 967 Comm: ip Not tainted 4.13.0-rc6+ #56
[ 3929.468625] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 3929.469124] task: ffff88003716a700 task.stack: ffff88005872c000
[ 3929.469449] RIP: 0010:__kmalloc_track_caller+0x117/0x1be
[ 3929.469746] RSP: 0018:ffff88005872f6a0 EFLAGS: 00010246
[ 3929.470042] RAX: 00000000000002de RBX: 0000000058a59000 RCX: 00000000000002df
[ 3929.470406] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff821f7020
[ 3929.470770] RBP: ffff88005872f6e8 R08: 000000000001f010 R09: 0000000000000000
[ 3929.471133] R10: ffff88005872f730 R11: 0000000000008cdd R12: ff006d75646d7564
[ 3929.471496] R13: 00000000014000c0 R14: ffff88005b403c00 R15: ffff88005b403c00
[ 3929.471869] FS:  00007f0b70480740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
[ 3929.472286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3929.472677] CR2: 00007ffcee4f3000 CR3: 0000000059d45000 CR4: 00000000000406e0
[ 3929.473209] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3929.474109] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3929.474873] Call Trace:
[ 3929.475337]  ? kstrdup_const+0x23/0x25
[ 3929.475863]  kstrdup+0x2e/0x4b
[ 3929.476338]  kstrdup_const+0x23/0x25
[ 3929.478084]  __kernfs_new_node+0x28/0xbc
[ 3929.478478]  kernfs_new_node+0x35/0x55
[ 3929.478929]  kernfs_create_link+0x23/0x76
[ 3929.479478]  sysfs_do_create_link_sd.isra.2+0x85/0xd7
[ 3929.480096]  sysfs_create_link+0x33/0x35
[ 3929.480649]  device_add+0x200/0x589
[ 3929.481184]  netdev_register_kobject+0x7c/0x12f
[ 3929.481711]  register_netdevice+0x373/0x471
[ 3929.482174]  rtnl_newlink+0x614/0x729
[ 3929.482610]  ? rtnl_newlink+0x17f/0x729
[ 3929.483080]  rtnetlink_rcv_msg+0x188/0x197
[ 3929.483533]  ? rcu_read_unlock+0x3e/0x5f
[ 3929.483984]  ? rtnl_newlink+0x729/0x729
[ 3929.484420]  netlink_rcv_skb+0x6c/0xce
[ 3929.484858]  rtnetlink_rcv+0x23/0x2a
[ 3929.485291]  netlink_unicast+0x103/0x181
[ 3929.485735]  netlink_sendmsg+0x326/0x337
[ 3929.486181]  sock_sendmsg_nosec+0x14/0x3f
[ 3929.486614]  sock_sendmsg+0x29/0x2e
[ 3929.486973]  ___sys_sendmsg+0x209/0x28b
[ 3929.487340]  ? do_raw_spin_unlock+0xcd/0xf8
[ 3929.487719]  ? _raw_spin_unlock+0x27/0x31
[ 3929.488092]  ? __handle_mm_fault+0x651/0xdb1
[ 3929.488471]  ? check_chain_key+0xb0/0xfd
[ 3929.488847]  __sys_sendmsg+0x45/0x63
[ 3929.489206]  ? __sys_sendmsg+0x45/0x63
[ 3929.489576]  SyS_sendmsg+0x19/0x1b
[ 3929.489901]  entry_SYSCALL_64_fastpath+0x23/0xc2
[ 3929.490172] RIP: 0033:0x7f0b6fb93690
[ 3929.490423] RSP: 002b:00007ffcee4ed588 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 3929.490881] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f0b6fb93690
[ 3929.491198] RDX: 0000000000000000 RSI: 00007ffcee4ed5d0 RDI: 0000000000000003
[ 3929.491521] RBP: ffff88005872ff98 R08: 0000000000000001 R09: 0000000000000000
[ 3929.491801] R10: 00007ffcee4ed350 R11: 0000000000000246 R12: 0000000000000002
[ 3929.492075] R13: 000000000066f1a0 R14: 00007ffcee4f5680 R15: 0000000000000000
[ 3929.492352]  ? trace_hardirqs_off_caller+0xa7/0xcf
[ 3929.492590] Code: 8b 45 c0 48 8b 45 b8 74 17 48 8b 4d c8 83 ca ff 44
89 ee 4c 89 f7 e8 83 ca ff ff 49 89 c4 eb 49 49 63 56 20 48 8d 48 01 4d
8b 06 <49> 8b 1c 14 48 89 c2 4c 89 e0 65 49 0f c7 08 0f 94 c0 83 f0 01
[ 3929.493335] RIP: __kmalloc_track_caller+0x117/0x1be RSP: ffff88005872f6a0

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Fixes: f07d1501292b ("multiq: Further multiqueue cleanup")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: delete now-unused 'err' variable]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agosch_htb: fix crash on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:48:57 +0000 (12:48 +0300)]
sch_htb: fix crash on init failure

commit 88c2ace69dbef696edba77712882af03879abc9c upstream.

The commit below added a call to the ->destroy() callback for all qdiscs
which failed in their ->init(), but some were not prepared for such
change and can't handle partially initialized qdisc. HTB is one of them
and if any error occurs before the qdisc watchdog timer and qdisc work are
initialized then we can hit either a null ptr deref (timer->base) when
canceling in ->destroy or lockdep error info about trying to register
a non-static key and a stack dump. So to fix these two move the watchdog
timer and workqueue init before anything that can err out.
To reproduce userspace needs to send broken htb qdisc create request,
tested with a modified tc (q_htb.c).

Trace log:
[ 2710.897602] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 2710.897977] IP: hrtimer_active+0x17/0x8a
[ 2710.898174] PGD 58fab067
[ 2710.898175] P4D 58fab067
[ 2710.898353] PUD 586c0067
[ 2710.898531] PMD 0
[ 2710.898710]
[ 2710.899045] Oops: 0000 [#1] SMP
[ 2710.899232] Modules linked in:
[ 2710.899419] CPU: 1 PID: 950 Comm: tc Not tainted 4.13.0-rc6+ #54
[ 2710.899646] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 2710.900035] task: ffff880059ed2700 task.stack: ffff88005ad4c000
[ 2710.900262] RIP: 0010:hrtimer_active+0x17/0x8a
[ 2710.900467] RSP: 0018:ffff88005ad4f960 EFLAGS: 00010246
[ 2710.900684] RAX: 0000000000000000 RBX: ffff88003701e298 RCX: 0000000000000000
[ 2710.900933] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003701e298
[ 2710.901177] RBP: ffff88005ad4f980 R08: 0000000000000001 R09: 0000000000000001
[ 2710.901419] R10: ffff88005ad4f800 R11: 0000000000000400 R12: 0000000000000000
[ 2710.901663] R13: ffff88003701e298 R14: ffffffff822a4540 R15: ffff88005ad4fac0
[ 2710.901907] FS:  00007f2f5e90f740(0000) GS:ffff88005d880000(0000) knlGS:0000000000000000
[ 2710.902277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2710.902500] CR2: 0000000000000000 CR3: 0000000058ca3000 CR4: 00000000000406e0
[ 2710.902744] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2710.902977] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2710.903180] Call Trace:
[ 2710.903332]  hrtimer_try_to_cancel+0x1a/0x93
[ 2710.903504]  hrtimer_cancel+0x15/0x20
[ 2710.903667]  qdisc_watchdog_cancel+0x12/0x14
[ 2710.903866]  htb_destroy+0x2e/0xf7
[ 2710.904097]  qdisc_create+0x377/0x3fd
[ 2710.904330]  tc_modify_qdisc+0x4d2/0x4fd
[ 2710.904511]  rtnetlink_rcv_msg+0x188/0x197
[ 2710.904682]  ? rcu_read_unlock+0x3e/0x5f
[ 2710.904849]  ? rtnl_newlink+0x729/0x729
[ 2710.905017]  netlink_rcv_skb+0x6c/0xce
[ 2710.905183]  rtnetlink_rcv+0x23/0x2a
[ 2710.905345]  netlink_unicast+0x103/0x181
[ 2710.905511]  netlink_sendmsg+0x326/0x337
[ 2710.905679]  sock_sendmsg_nosec+0x14/0x3f
[ 2710.905847]  sock_sendmsg+0x29/0x2e
[ 2710.906010]  ___sys_sendmsg+0x209/0x28b
[ 2710.906176]  ? do_raw_spin_unlock+0xcd/0xf8
[ 2710.906346]  ? _raw_spin_unlock+0x27/0x31
[ 2710.906514]  ? __handle_mm_fault+0x651/0xdb1
[ 2710.906685]  ? check_chain_key+0xb0/0xfd
[ 2710.906855]  __sys_sendmsg+0x45/0x63
[ 2710.907018]  ? __sys_sendmsg+0x45/0x63
[ 2710.907185]  SyS_sendmsg+0x19/0x1b
[ 2710.907344]  entry_SYSCALL_64_fastpath+0x23/0xc2

Note that probably this bug goes further back because the default qdisc
handling always calls ->destroy on init failure too.

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agonet_sched: fix error recovery at qdisc creation
Eric Dumazet [Fri, 10 Feb 2017 18:31:49 +0000 (10:31 -0800)]
net_sched: fix error recovery at qdisc creation

commit 87b60cfacf9f17cf71933c6e33b66e68160af71d upstream.

Dmitry reported uses after free in qdisc code [1]

The problem here is that ops->init() can return an error.

qdisc_create_dflt() then call ops->destroy(),
while qdisc_create() does _not_ call it.

Four qdisc chose to call their own ops->destroy(), assuming their caller
would not.

This patch makes sure qdisc_create() calls ops->destroy()
and fixes the four qdisc to avoid double free.

[1]
BUG: KASAN: use-after-free in mq_destroy+0x242/0x290 net/sched/sch_mq.c:33 at addr ffff8801d415d440
Read of size 8 by task syz-executor2/5030
CPU: 0 PID: 5030 Comm: syz-executor2 Not tainted 4.3.5-smp-DEV #119
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
 0000000000000046 ffff8801b435b870 ffffffff81bbbed4 ffff8801db000400
 ffff8801d415d440 ffff8801d415dc40 ffff8801c4988510 ffff8801b435b898
 ffffffff816682b1 ffff8801b435b928 ffff8801d415d440 ffff8801c49880c0
Call Trace:
 [<ffffffff81bbbed4>] __dump_stack lib/dump_stack.c:15 [inline]
 [<ffffffff81bbbed4>] dump_stack+0x6c/0x98 lib/dump_stack.c:51
 [<ffffffff816682b1>] kasan_object_err+0x21/0x70 mm/kasan/report.c:158
 [<ffffffff81668524>] print_address_description mm/kasan/report.c:196 [inline]
 [<ffffffff81668524>] kasan_report_error+0x1b4/0x4b0 mm/kasan/report.c:285
 [<ffffffff81668953>] kasan_report mm/kasan/report.c:305 [inline]
 [<ffffffff81668953>] __asan_report_load8_noabort+0x43/0x50 mm/kasan/report.c:326
 [<ffffffff82527b02>] mq_destroy+0x242/0x290 net/sched/sch_mq.c:33
 [<ffffffff82524bdd>] qdisc_destroy+0x12d/0x290 net/sched/sch_generic.c:953
 [<ffffffff82524e30>] qdisc_create_dflt+0xf0/0x120 net/sched/sch_generic.c:848
 [<ffffffff8252550d>] attach_default_qdiscs net/sched/sch_generic.c:1029 [inline]
 [<ffffffff8252550d>] dev_activate+0x6ad/0x880 net/sched/sch_generic.c:1064
 [<ffffffff824b1db1>] __dev_open+0x221/0x320 net/core/dev.c:1403
 [<ffffffff824b24ce>] __dev_change_flags+0x15e/0x3e0 net/core/dev.c:6858
 [<ffffffff824b27de>] dev_change_flags+0x8e/0x140 net/core/dev.c:6926
 [<ffffffff824f5bf6>] dev_ifsioc+0x446/0x890 net/core/dev_ioctl.c:260
 [<ffffffff824f61fa>] dev_ioctl+0x1ba/0xb80 net/core/dev_ioctl.c:546
 [<ffffffff82430509>] sock_do_ioctl+0x99/0xb0 net/socket.c:879
 [<ffffffff82430d30>] sock_ioctl+0x2a0/0x390 net/socket.c:958
 [<ffffffff816f3b68>] vfs_ioctl fs/ioctl.c:44 [inline]
 [<ffffffff816f3b68>] do_vfs_ioctl+0x8a8/0xe50 fs/ioctl.c:611
 [<ffffffff816f41a4>] SYSC_ioctl fs/ioctl.c:626 [inline]
 [<ffffffff816f41a4>] SyS_ioctl+0x94/0xc0 fs/ioctl.c:617
 [<ffffffff8123e357>] entry_SYSCALL_64_fastpath+0x12/0x17

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
 - Drop changes to sch_hhf (doesn't exist) and sch_sfq (doesn't have this bug)
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoCIFS: remove endian related sparse warning
Steve French [Sun, 27 Aug 2017 21:56:08 +0000 (16:56 -0500)]
CIFS: remove endian related sparse warning

commit 6e3c1529c39e92ed64ca41d53abadabbaa1d5393 upstream.

Recent patch had an endian warning ie
cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()

Signed-off-by: Steve French <smfrench@gmail.com>
CC: Ronnie Sahlberg <lsahlber@redhat.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoalpha: uapi: Add support for __SANE_USERSPACE_TYPES__
Ben Hutchings [Thu, 1 Oct 2015 00:35:55 +0000 (01:35 +0100)]
alpha: uapi: Add support for __SANE_USERSPACE_TYPES__

commit cec80d82142ab25c71eee24b529cfeaf17c43062 upstream.

This fixes compiler errors in perf such as:

tests/attr.c: In function 'store_event':
tests/attr.c:66:27: error: format '%llu' expects argument of type 'long long unsigned int', but argument 6 has type '__u64 {aka long unsigned int}' [-Werror=format=]
  snprintf(path, PATH_MAX, "%s/event-%d-%llu-%d", dir,
                           ^

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Matt Turner <mattst88@gmail.com>
7 years agocpumask: fix spurious cpumask_of_node() on non-NUMA multi-node configs
Tejun Heo [Mon, 28 Aug 2017 21:51:27 +0000 (14:51 -0700)]
cpumask: fix spurious cpumask_of_node() on non-NUMA multi-node configs

commit b339752d054fb32863418452dff350a1086885b1 upstream.

When !NUMA, cpumask_of_node(@node) equals cpu_online_mask regardless of
@node.  The assumption seems that if !NUMA, there shouldn't be more than
one node and thus reporting cpu_online_mask regardless of @node is
correct.  However, that assumption was broken years ago to support
DISCONTIGMEM and whether a system has multiple nodes or not is
separately controlled by NEED_MULTIPLE_NODES.

This means that, on a system with !NUMA && NEED_MULTIPLE_NODES,
cpumask_of_node() will report cpu_online_mask for all possible nodes,
indicating that the CPUs are associated with multiple nodes which is an
impossible configuration.

This bug has been around forever but doesn't look like it has caused any
noticeable symptoms.  However, it triggers a WARN recently added to
workqueue to verify NUMA affinity configuration.

Fix it by reporting empty cpumask on non-zero nodes if !NUMA.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoipv6: fix sparse warning on rt6i_node
Wei Wang [Fri, 25 Aug 2017 22:03:10 +0000 (15:03 -0700)]
ipv6: fix sparse warning on rt6i_node

commit 4e587ea71bf924f7dac621f1351653bd41e446cb upstream.

Commit c5cff8561d2d adds rcu grace period before freeing fib6_node. This
generates a new sparse warning on rt->rt6i_node related code:
  net/ipv6/route.c:1394:30: error: incompatible types in comparison
  expression (different address spaces)
  ./include/net/ip6_fib.h:187:14: error: incompatible types in comparison
  expression (different address spaces)

This commit adds "__rcu" tag for rt6i_node and makes sure corresponding
rcu API is used for it.
After this fix, sparse no longer generates the above warning.

Fixes: c5cff8561d2d ("ipv6: add rcu grace period before freeing fib6_node")
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
 - fib6_add_rt2node() has only one assignment to update
 - Drop changes in rt6_cache_allowed_for_pmtu()
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agol2tp: hold tunnel used while creating sessions with netlink
Guillaume Nault [Fri, 25 Aug 2017 14:51:46 +0000 (16:51 +0200)]
l2tp: hold tunnel used while creating sessions with netlink

commit e702c1204eb57788ef189c839c8c779368267d70 upstream.

Use l2tp_tunnel_get() to retrieve tunnel, so that it can't go away on
us. Otherwise l2tp_tunnel_destruct() might release the last reference
count concurrently, thus freeing the tunnel while we're using it.

Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agol2tp: remove useless duplicate session detection in l2tp_netlink
Guillaume Nault [Tue, 11 Apr 2017 11:12:13 +0000 (13:12 +0200)]
l2tp: remove useless duplicate session detection in l2tp_netlink

commit af87ae465abdc070de0dc35d6c6a9e7a8cd82987 upstream.

There's no point in checking for duplicate sessions at the beginning of
l2tp_nl_cmd_session_create(); the ->session_create() callbacks already
return -EEXIST when the session already exists.

Furthermore, even if l2tp_session_find() returns NULL, a new session
might be created right after the test. So relying on ->session_create()
to avoid duplicate session is the only sane behaviour.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: also delete the now-unused local variable]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agol2tp: hold tunnel while handling genl TUNNEL_GET commands
Guillaume Nault [Fri, 25 Aug 2017 14:51:43 +0000 (16:51 +0200)]
l2tp: hold tunnel while handling genl TUNNEL_GET commands

commit 4e4b21da3acc68a7ea55f850cacc13706b7480e9 upstream.

Use l2tp_tunnel_get() instead of l2tp_tunnel_find() so that we get
a reference on the tunnel, preventing l2tp_tunnel_destruct() from
freeing it from under us.

Also move l2tp_tunnel_get() below nlmsg_new() so that we only take
the reference when needed.

Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agol2tp: hold tunnel while handling genl tunnel updates
Guillaume Nault [Fri, 25 Aug 2017 14:51:42 +0000 (16:51 +0200)]
l2tp: hold tunnel while handling genl tunnel updates

commit 8c0e421525c9eb50d68e8f633f703ca31680b746 upstream.

We need to make sure the tunnel is not going to be destroyed by
l2tp_tunnel_destruct() concurrently.

Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agol2tp: hold tunnel while processing genl delete command
Guillaume Nault [Fri, 25 Aug 2017 14:51:42 +0000 (16:51 +0200)]
l2tp: hold tunnel while processing genl delete command

commit bb0a32ce4389e17e47e198d2cddaf141561581ad upstream.

l2tp_nl_cmd_tunnel_delete() needs to take a reference on the tunnel, to
prevent it from being concurrently freed by l2tp_tunnel_destruct().

Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agol2tp: hold tunnel while looking up sessions in l2tp_netlink
Guillaume Nault [Fri, 25 Aug 2017 14:51:40 +0000 (16:51 +0200)]
l2tp: hold tunnel while looking up sessions in l2tp_netlink

commit 54652eb12c1b72e9602d09cb2821d5760939190f upstream.

l2tp_tunnel_find() doesn't take a reference on the returned tunnel.
Therefore, it's unsafe to use it because the returned tunnel can go
away on us anytime.

Fix this by defining l2tp_tunnel_get(), which works like
l2tp_tunnel_find(), but takes a reference on the returned tunnel.
Caller then has to drop this reference using l2tp_tunnel_dec_refcount().

As l2tp_tunnel_dec_refcount() needs to be moved to l2tp_core.h, let's
simplify the patch and not move the L2TP_REFCNT_DEBUG part. This code
has been broken (not even compiling) in May 2012 by
commit a4ca44fa578c ("net: l2tp: Standardize logging styles")
and fixed more than two years later by
commit 29abe2fda54f ("l2tp: fix missing line continuation"). So it
doesn't appear to be used by anyone.

Same thing for l2tp_tunnel_free(); instead of moving it to l2tp_core.h,
let's just simplify things and call kfree_rcu() directly in
l2tp_tunnel_dec_refcount(). Extra assertions and debugging code
provided by l2tp_tunnel_free() didn't help catching any of the
reference counting and socket handling issues found while working on
this series.

Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: l2tp_tunnel_free() does more than just kfree_rcu(), so
 don't remove it]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agol2tp: define parameters of l2tp_session_get*() as "const"
Guillaume Nault [Wed, 12 Apr 2017 08:05:29 +0000 (10:05 +0200)]
l2tp: define parameters of l2tp_session_get*() as "const"

commit 9aaef50c44f132e040dcd7686c8e78a3390037c5 upstream.

Make l2tp_pernet()'s parameter constant, so that l2tp_session_get*() can
declare their "net" variable as "const".
Also constify "ifname" in l2tp_session_get_by_ifname().

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agol2tp: initialise session's refcount before making it reachable
Guillaume Nault [Fri, 25 Aug 2017 14:22:17 +0000 (16:22 +0200)]
l2tp: initialise session's refcount before making it reachable

commit 9ee369a405c57613d7c83a3967780c3e30c52ecc upstream.

Sessions must be fully initialised before calling
l2tp_session_add_to_tunnel(). Otherwise, there's a short time frame
where partially initialised sessions can be accessed by external users.

Fixes: dbdbc73b4478 ("l2tp: fix duplicate session creation")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: keep using l2tp_session_inc_refcount()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agodm: fix printk() rate limiting code
Bart Van Assche [Wed, 9 Aug 2017 18:32:11 +0000 (11:32 -0700)]
dm: fix printk() rate limiting code

commit 604407890ecf624c2fb41013c82b22aade59b455 upstream.

Using the same rate limiting state for different kinds of messages
is wrong because this can cause a high frequency message to suppress
a report of a low frequency message. Hence use a unique rate limiting
state per message type.

Fixes: 71a16736a15e ("dm: use local printk ratelimit")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agodm: convert DM printk macros to pr_<level> macros
Joe Perches [Thu, 20 Apr 2017 17:46:07 +0000 (10:46 -0700)]
dm: convert DM printk macros to pr_<level> macros

commit d2c3c8dcb5987b8352e82089c79a41b6e17e28d2 upstream.

Using pr_<level> is the more common logging style.

Standardize style and use new macro DM_FMT.
Use no_printk in DMDEBUG macros when CONFIG_DM_DEBUG is not #defined.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoxfrm_user: fix info leak in build_aevent()
Mathias Krause [Sat, 26 Aug 2017 15:09:00 +0000 (17:09 +0200)]
xfrm_user: fix info leak in build_aevent()

commit 931e79d7a7ddee4709c56b39de169a36804589a1 upstream.

The memory reserved to dump the ID of the xfrm state includes a padding
byte in struct xfrm_usersa_id added by the compiler for alignment. To
prevent the heap info leak, memset(0) the sa_id before filling it.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Fixes: d51d081d6504 ("[IPSEC]: Sync series - user")
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoxfrm_user: fix info leak in xfrm_notify_sa()
Mathias Krause [Sat, 26 Aug 2017 15:08:58 +0000 (17:08 +0200)]
xfrm_user: fix info leak in xfrm_notify_sa()

commit 50329c8a340c9dea60d837645fcf13fc36bfb84d upstream.

The memory reserved to dump the ID of the xfrm state includes a padding
byte in struct xfrm_usersa_id added by the compiler for alignment. To
prevent the heap info leak, memset(0) the whole struct before filling
it.

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Fixes: 0603eac0d6b7 ("[IPSEC]: Add XFRMA_SA/XFRMA_POLICY for delete notification")
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agor8169: Do not increment tx_dropped in TX ring cleaning
Florian Fainelli [Fri, 25 Aug 2017 01:34:43 +0000 (18:34 -0700)]
r8169: Do not increment tx_dropped in TX ring cleaning

commit 1089650d8837095f63e001bbf14d7b48043d67ad upstream.

rtl8169_tx_clear_range() is responsible for cleaning up the TX ring
during interface shutdown, incrementing tx_dropped for every SKB that we
left at the time in the ring is misleading.

Fixes: cac4b22f3d6a ("r8169: do not account fragments as packets")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoipv6: Fix may be used uninitialized warning in rt6_check
Steffen Klassert [Fri, 25 Aug 2017 07:05:42 +0000 (09:05 +0200)]
ipv6: Fix may be used uninitialized warning in rt6_check

commit 3614364527daa870264f6dde77f02853cdecd02c upstream.

rt_cookie might be used uninitialized, fix this by
initializing it.

Fixes: c5cff8561d2d ("ipv6: add rcu grace period before freeing fib6_node")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoipv6: add rcu grace period before freeing fib6_node
Wei Wang [Mon, 21 Aug 2017 16:47:10 +0000 (09:47 -0700)]
ipv6: add rcu grace period before freeing fib6_node

commit c5cff8561d2d0006e972bd114afd51f082fee77c upstream.

We currently keep rt->rt6i_node pointing to the fib6_node for the route.
And some functions make use of this pointer to dereference the fib6_node
from rt structure, e.g. rt6_check(). However, as there is neither
refcount nor rcu taken when dereferencing rt->rt6i_node, it could
potentially cause crashes as rt->rt6i_node could be set to NULL by other
CPUs when doing a route deletion.
This patch introduces an rcu grace period before freeing fib6_node and
makes sure the functions that dereference it takes rcu_read_lock().

Note: there is no "Fixes" tag because this bug was there in a very
early stage.

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoipv6: Add rt6_get_cookie() function
Martin KaFai Lau [Sat, 23 May 2015 03:56:01 +0000 (20:56 -0700)]
ipv6: Add rt6_get_cookie() function

commit b197df4f0f3782782e9ea8996e91b65ae33e8dd9 upstream.

Instead of doing the rt6->rt6i_node check whenever we need
to get the route's cookie.  Refactor it into rt6_get_cookie().
It is a prep work to handle FLOWI_FLAG_KNOWN_NH and also
percpu rt6_info later.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
 - Drop changes in inet6_sk_rx_dst_set(), sctp_v6_get_dst()
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoPM/hibernate: touch NMI watchdog when creating snapshot
Chen Yu [Fri, 25 Aug 2017 22:55:30 +0000 (15:55 -0700)]
PM/hibernate: touch NMI watchdog when creating snapshot

commit 556b969a1cfe2686aae149137fa1dfcac0eefe54 upstream.

There is a problem that when counting the pages for creating the
hibernation snapshot will take significant amount of time, especially on
system with large memory.  Since the counting job is performed with irq
disabled, this might lead to NMI lockup.  The following warning were
found on a system with 1.5TB DRAM:

  Freezing user space processes ... (elapsed 0.002 seconds) done.
  OOM killer disabled.
  PM: Preallocating image memory...
  NMI watchdog: Watchdog detected hard LOCKUP on cpu 27
  CPU: 27 PID: 3128 Comm: systemd-sleep Not tainted 4.13.0-0.rc2.git0.1.fc27.x86_64 #1
  task: ffff9f01971ac000 task.stack: ffffb1a3f325c000
  RIP: 0010:memory_bm_find_bit+0xf4/0x100
  Call Trace:
   swsusp_set_page_free+0x2b/0x30
   mark_free_pages+0x147/0x1c0
   count_data_pages+0x41/0xa0
   hibernate_preallocate_memory+0x80/0x450
   hibernation_snapshot+0x58/0x410
   hibernate+0x17c/0x310
   state_store+0xdf/0xf0
   kobj_attr_store+0xf/0x20
   sysfs_kf_write+0x37/0x40
   kernfs_fop_write+0x11c/0x1a0
   __vfs_write+0x37/0x170
   vfs_write+0xb1/0x1a0
   SyS_write+0x55/0xc0
   entry_SYSCALL_64_fastpath+0x1a/0xa5
  ...
  done (allocated 6590003 pages)
  PM: Allocated 26360012 kbytes in 19.89 seconds (1325.28 MB/s)

It has taken nearly 20 seconds(2.10GHz CPU) thus the NMI lockup was
triggered.  In case the timeout of the NMI watch dog has been set to 1
second, a safe interval should be 6590003/20 = 320k pages in theory.
However there might also be some platforms running at a lower frequency,
so feed the watchdog every 100k pages.

[yu.c.chen@intel.com: simplification]
Link: http://lkml.kernel.org/r/1503460079-29721-1-git-send-email-yu.c.chen@intel.com
[yu.c.chen@intel.com: use interval of 128k instead of 100k to avoid modulus]
Link: http://lkml.kernel.org/r/1503328098-5120-1-git-send-email-yu.c.chen@intel.com
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reported-by: Jan Filipcewicz <jan.filipcewicz@intel.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoperf/core: Fix group {cpu,task} validation
Mark Rutland [Thu, 22 Jun 2017 14:41:38 +0000 (15:41 +0100)]
perf/core: Fix group {cpu,task} validation

commit 64aee2a965cf2954a038b5522f11d2cd2f0f8f3e upstream.

Regardless of which events form a group, it does not make sense for the
events to target different tasks and/or CPUs, as this leaves the group
inconsistent and impossible to schedule. The core perf code assumes that
these are consistent across (successfully intialised) groups.

Core perf code only verifies this when moving SW events into a HW
context. Thus, we can violate this requirement for pure SW groups and
pure HW groups, unless the relevant PMU driver happens to perform this
verification itself. These mismatched groups subsequently wreak havoc
elsewhere.

For example, we handle watchpoints as SW events, and reserve watchpoint
HW on a per-CPU basis at pmu::event_init() time to ensure that any event
that is initialised is guaranteed to have a slot at pmu::add() time.
However, the core code only checks the group leader's cpu filter (via
event_filter_match()), and can thus install follower events onto CPUs
violating thier (mismatched) CPU filters, potentially installing them
into a CPU without sufficient reserved slots.

This can be triggered with the below test case, resulting in warnings
from arch backends.

  #define _GNU_SOURCE
  #include <linux/hw_breakpoint.h>
  #include <linux/perf_event.h>
  #include <sched.h>
  #include <stdio.h>
  #include <sys/prctl.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
   int group_fd, unsigned long flags)
  {
return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
  }

  char watched_char;

  struct perf_event_attr wp_attr = {
.type = PERF_TYPE_BREAKPOINT,
.bp_type = HW_BREAKPOINT_RW,
.bp_addr = (unsigned long)&watched_char,
.bp_len = 1,
.size = sizeof(wp_attr),
  };

  int main(int argc, char *argv[])
  {
int leader, ret;
cpu_set_t cpus;

/*
 * Force use of CPU0 to ensure our CPU0-bound events get scheduled.
 */
CPU_ZERO(&cpus);
CPU_SET(0, &cpus);
ret = sched_setaffinity(0, sizeof(cpus), &cpus);
if (ret) {
printf("Unable to set cpu affinity\n");
return 1;
}

/* open leader event, bound to this task, CPU0 only */
leader = perf_event_open(&wp_attr, 0, 0, -1, 0);
if (leader < 0) {
printf("Couldn't open leader: %d\n", leader);
return 1;
}

/*
 * Open a follower event that is bound to the same task, but a
 * different CPU. This means that the group should never be possible to
 * schedule.
 */
ret = perf_event_open(&wp_attr, 0, 1, leader, 0);
if (ret < 0) {
printf("Couldn't open mismatched follower: %d\n", ret);
return 1;
} else {
printf("Opened leader/follower with mismastched CPUs\n");
}

/*
 * Open as many independent events as we can, all bound to the same
 * task, CPU0 only.
 */
do {
ret = perf_event_open(&wp_attr, 0, 0, -1, 0);
} while (ret >= 0);

/*
 * Force enable/disble all events to trigger the erronoeous
 * installation of the follower event.
 */
printf("Opened all events. Toggling..\n");
for (;;) {
prctl(PR_TASK_PERF_EVENTS_DISABLE, 0, 0, 0, 0);
prctl(PR_TASK_PERF_EVENTS_ENABLE, 0, 0, 0, 0);
}

return 0;
  }

Fix this by validating this requirement regardless of whether we're
moving events.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Zhou Chengming <zhouchengming1@huawei.com>
Link: http://lkml.kernel.org/r/1498142498-15758-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoperf: Tighten (and fix) the grouping condition
Peter Zijlstra [Fri, 23 Jan 2015 10:19:48 +0000 (11:19 +0100)]
perf: Tighten (and fix) the grouping condition

commit c3c87e770458aa004bd7ed3f29945ff436fd6511 upstream.

The fix from 9fc81d87420d ("perf: Fix events installation during
moving group") was incomplete in that it failed to recognise that
creating a group with events for different CPUs is semantically
broken -- they cannot be co-scheduled.

Furthermore, it leads to real breakage where, when we create an event
for CPU Y and then migrate it to form a group on CPU X, the code gets
confused where the counter is programmed -- triggered in practice
as well by me via the perf fuzzer.

Fix this by tightening the rules for creating groups. Only allow
grouping of counters that can be co-scheduled in the same context.
This means for the same task and/or the same cpu.

Fixes: 9fc81d87420d ("perf: Fix events installation during moving group")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.090683288@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoqlge: avoid memcpy buffer overflow
Arnd Bergmann [Wed, 23 Aug 2017 13:59:49 +0000 (15:59 +0200)]
qlge: avoid memcpy buffer overflow

commit e58f95831e7468d25eb6e41f234842ecfe6f014f upstream.

gcc-8.0.0 (snapshot) points out that we copy a variable-length string
into a fixed length field using memcpy() with the destination length,
and that ends up copying whatever follows the string:

    inlined from 'ql_core_dump' at drivers/net/ethernet/qlogic/qlge/qlge_dbg.c:1106:2:
drivers/net/ethernet/qlogic/qlge/qlge_dbg.c:708:2: error: 'memcpy' reading 15 bytes from a region of size 14 [-Werror=stringop-overflow=]
  memcpy(seg_hdr->description, desc, (sizeof(seg_hdr->description)) - 1);

Changing it to use strncpy() will instead zero-pad the destination,
which seems to be the right thing to do here.

The bug is probably harmless, but it seems like a good idea to address
it in stable kernels as well, if only for the purpose of building with
gcc-8 without warnings.

Fixes: a61f80261306 ("qlge: Add ethtool register dump function.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agocifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
Ronnie Sahlberg [Wed, 23 Aug 2017 04:48:14 +0000 (14:48 +1000)]
cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()

commit d3edede29f74d335f81d95a4588f5f136a9f7dcf upstream.

Add checking for the path component length and verify it is <= the maximum
that the server advertizes via FileFsAttributeInformation.

With this patch cifs.ko will now return ENAMETOOLONG instead of ENOENT
when users to access an overlong path.

To test this, try to cd into a (non-existing) directory on a CIFS share
that has a too long name:
cd /mnt/aaaaaaaaaaaaaaa...

and it now should show a good error message from the shell:
bash: cd: /mnt/aaaaaaaaaaaaaaaa...aaaaaa: File name too long

rh bz 1153996

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.2: name checks are done only in cifs_lookup()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)
Takashi Iwai [Wed, 23 Aug 2017 07:30:17 +0000 (09:30 +0200)]
ALSA: hda - Add stereo mic quirk for Lenovo G50-70 (17aa:3978)

commit bbba6f9d3da357bbabc6fda81e99ff5584500e76 upstream.

Lenovo G50-70 (17aa:3978) with Conexant codec chip requires the
similar workaround for the inverted stereo dmic like other Lenovo
models.

Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1020657
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoipv6: accept 64k - 1 packet length in ip6_find_1stfragopt()
Stefano Brivio [Fri, 18 Aug 2017 12:40:53 +0000 (14:40 +0200)]
ipv6: accept 64k - 1 packet length in ip6_find_1stfragopt()

commit 3de33e1ba0506723ab25734e098cf280ecc34756 upstream.

A packet length of exactly IPV6_MAXPLEN is allowed, we should
refuse parsing options only if the size is 64KiB or more.

While at it, remove one extra variable and one assignment which
were also introduced by the commit that introduced the size
check. Checking the sum 'offset + len' and only later adding
'len' to 'offset' doesn't provide any advantage over directly
summing to 'offset' and checking it.

Fixes: 6399f1fae4ec ("ipv6: avoid overflow of offset in ip6_find_1stfragopt")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoALSA: core: Fix unexpected error at replacing user TLV
Takashi Iwai [Tue, 22 Aug 2017 06:15:13 +0000 (08:15 +0200)]
ALSA: core: Fix unexpected error at replacing user TLV

commit 88c54cdf61f508ebcf8da2d819f5dfc03e954d1d upstream.

When user tries to replace the user-defined control TLV, the kernel
checks the change of its content via memcmp().  The problem is that
the kernel passes the return value from memcmp() as is.  memcmp()
gives a non-zero negative value depending on the comparison result,
and this shall be recognized as an error code.

The patch covers that corner-case, return 1 properly for the changed
TLV.

Fixes: 8aa9b586e420 ("[ALSA] Control API - more robust TLV implementation")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoInput: trackpoint - add new trackpoint firmware ID
Aaron Ma [Fri, 18 Aug 2017 19:17:21 +0000 (12:17 -0700)]
Input: trackpoint - add new trackpoint firmware ID

commit ec667683c532c93fb41e100e5d61a518971060e2 upstream.

Synaptics add new TP firmware ID: 0x2 and 0x3, for now both lower 2 bits
are indicated as TP. Change the constant to bitwise values.

This makes trackpoint to be recognized on Lenovo Carbon X1 Gen5 instead
of it being identified as "PS/2 Generic Mouse".

Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agomm/mempolicy: fix use after free when calling get_mempolicy
zhong jiang [Fri, 18 Aug 2017 22:16:24 +0000 (15:16 -0700)]
mm/mempolicy: fix use after free when calling get_mempolicy

commit 73223e4e2e3867ebf033a5a8eb2e5df0158ccc99 upstream.

I hit a use after free issue when executing trinity and repoduced it
with KASAN enabled.  The related call trace is as follows.

  BUG: KASan: use after free in SyS_get_mempolicy+0x3c8/0x960 at addr ffff8801f582d766
  Read of size 2 by task syz-executor1/798

  INFO: Allocated in mpol_new.part.2+0x74/0x160 age=3 cpu=1 pid=799
     __slab_alloc+0x768/0x970
     kmem_cache_alloc+0x2e7/0x450
     mpol_new.part.2+0x74/0x160
     mpol_new+0x66/0x80
     SyS_mbind+0x267/0x9f0
     system_call_fastpath+0x16/0x1b
  INFO: Freed in __mpol_put+0x2b/0x40 age=4 cpu=1 pid=799
     __slab_free+0x495/0x8e0
     kmem_cache_free+0x2f3/0x4c0
     __mpol_put+0x2b/0x40
     SyS_mbind+0x383/0x9f0
     system_call_fastpath+0x16/0x1b
  INFO: Slab 0xffffea0009cb8dc0 objects=23 used=8 fp=0xffff8801f582de40 flags=0x200000000004080
  INFO: Object 0xffff8801f582d760 @offset=5984 fp=0xffff8801f582d600

  Bytes b4 ffff8801f582d750: ae 01 ff ff 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a  ........ZZZZZZZZ
  Object ffff8801f582d760: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
  Object ffff8801f582d770: 6b 6b 6b 6b 6b 6b 6b a5                          kkkkkkk.
  Redzone ffff8801f582d778: bb bb bb bb bb bb bb bb                          ........
  Padding ffff8801f582d8b8: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
  Memory state around the buggy address:
  ffff8801f582d600: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff8801f582d680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  >ffff8801f582d700: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fc

!shared memory policy is not protected against parallel removal by other
thread which is normally protected by the mmap_sem.  do_get_mempolicy,
however, drops the lock midway while we can still access it later.

Early premature up_read is a historical artifact from times when
put_user was called in this path see https://lwn.net/Articles/124754/
but that is gone since 8bccd85ffbaf ("[PATCH] Implement sys_* do_*
layering in the memory policy layer.").  but when we have the the
current mempolicy ref count model.  The issue was introduced
accordingly.

Fix the issue by removing the premature release.

Link: http://lkml.kernel.org/r/1502950924-27521-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoALSA: usb-audio: Add mute TLV for playback volumes on C-Media devices
Takashi Iwai [Wed, 16 Aug 2017 12:18:37 +0000 (14:18 +0200)]
ALSA: usb-audio: Add mute TLV for playback volumes on C-Media devices

commit 0f174b3525a43bd51f9397394763925e0ebe7bc7 upstream.

C-Media devices (at least some models) mute the playback stream when
volumes are set to the minimum value.  But this isn't informed via TLV
and the user-space, typically PulseAudio, gets confused as if it's
still played in a low volume.

This patch adds the new flag, min_mute, to struct usb_mixer_elem_info
for indicating that the mixer element is with the minimum-mute volume.
This flag is set for known C-Media devices in
snd_usb_mixer_fu_apply_quirk() in turn.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196669
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoparisc: pci memory bar assignment fails with 64bit kernels on dino/cujo
Thomas Bogendoerfer [Sat, 12 Aug 2017 21:36:47 +0000 (23:36 +0200)]
parisc: pci memory bar assignment fails with 64bit kernels on dino/cujo

commit 4098116039911e8870d84c975e2ec22dab65a909 upstream.

For 64bit kernels the lmmio_space_offset of the host bridge window
isn't set correctly on systems with dino/cujo PCI host bridges.
This leads to not assigned memory bars and failing drivers, which
need to use these bars.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoaudit: Fix use after free in audit_remove_watch_rule()
Jan Kara [Tue, 15 Aug 2017 11:00:36 +0000 (13:00 +0200)]
audit: Fix use after free in audit_remove_watch_rule()

commit d76036ab47eafa6ce52b69482e91ca3ba337d6d6 upstream.

audit_remove_watch_rule() drops watch's reference to parent but then
continues to work with it. That is not safe as parent can get freed once
we drop our reference. The following is a trivial reproducer:

mount -o loop image /mnt
touch /mnt/file
auditctl -w /mnt/file -p wax
umount /mnt
auditctl -D
<crash in fsnotify_destroy_mark()>

Grab our own reference in audit_remove_watch_rule() earlier to make sure
mark does not get freed under us.

Reported-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Tested-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Paul Moore <paul@paul-moore.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoaf_key: do not use GFP_KERNEL in atomic contexts
Eric Dumazet [Mon, 14 Aug 2017 17:16:45 +0000 (10:16 -0700)]
af_key: do not use GFP_KERNEL in atomic contexts

commit 36f41f8fc6d8aa9f8c9072d66ff7cf9055f5e69b upstream.

pfkey_broadcast() might be called from non process contexts,
we can not use GFP_KERNEL in these cases [1].

This patch partially reverts commit ba51b6be38c1 ("net: Fix RCU splat in
af_key"), only keeping the GFP_ATOMIC forcing under rcu_read_lock()
section.

[1] : syzkaller reported :

in_atomic(): 1, irqs_disabled(): 0, pid: 2932, name: syzkaller183439
3 locks held by syzkaller183439/2932:
 #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff83b43888>] pfkey_sendmsg+0x4c8/0x9f0 net/key/af_key.c:3649
 #1:  (&pfk->dump_lock){+.+.+.}, at: [<ffffffff83b467f6>] pfkey_do_dump+0x76/0x3f0 net/key/af_key.c:293
 #2:  (&(&net->xfrm.xfrm_policy_lock)->rlock){+...+.}, at: [<ffffffff83957632>] spin_lock_bh include/linux/spinlock.h:304 [inline]
 #2:  (&(&net->xfrm.xfrm_policy_lock)->rlock){+...+.}, at: [<ffffffff83957632>] xfrm_policy_walk+0x192/0xa30 net/xfrm/xfrm_policy.c:1028
CPU: 0 PID: 2932 Comm: syzkaller183439 Not tainted 4.13.0-rc4+ #24
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 ___might_sleep+0x2b2/0x470 kernel/sched/core.c:5994
 __might_sleep+0x95/0x190 kernel/sched/core.c:5947
 slab_pre_alloc_hook mm/slab.h:416 [inline]
 slab_alloc mm/slab.c:3383 [inline]
 kmem_cache_alloc+0x24b/0x6e0 mm/slab.c:3559
 skb_clone+0x1a0/0x400 net/core/skbuff.c:1037
 pfkey_broadcast_one+0x4b2/0x6f0 net/key/af_key.c:207
 pfkey_broadcast+0x4ba/0x770 net/key/af_key.c:281
 dump_sp+0x3d6/0x500 net/key/af_key.c:2685
 xfrm_policy_walk+0x2f1/0xa30 net/xfrm/xfrm_policy.c:1042
 pfkey_dump_sp+0x42/0x50 net/key/af_key.c:2695
 pfkey_do_dump+0xaa/0x3f0 net/key/af_key.c:299
 pfkey_spddump+0x1a0/0x210 net/key/af_key.c:2722
 pfkey_process+0x606/0x710 net/key/af_key.c:2814
 pfkey_sendmsg+0x4d6/0x9f0 net/key/af_key.c:3650
sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 ___sys_sendmsg+0x755/0x890 net/socket.c:2035
 __sys_sendmsg+0xe5/0x210 net/socket.c:2069
 SYSC_sendmsg net/socket.c:2080 [inline]
 SyS_sendmsg+0x2d/0x50 net/socket.c:2076
 entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x445d79
RSP: 002b:00007f32447c1dc8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000445d79
RDX: 0000000000000000 RSI: 000000002023dfc8 RDI: 0000000000000008
RBP: 0000000000000086 R08: 00007f32447c2700 R09: 00007f32447c2700
R10: 00007f32447c2700 R11: 0000000000000202 R12: 0000000000000000
R13: 00007ffe33edec4f R14: 00007f32447c29c0 R15: 0000000000000000

Fixes: ba51b6be38c1 ("net: Fix RCU splat in af_key")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: David Ahern <dsa@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoxfs: fix inobt inode allocation search optimization
Omar Sandoval [Fri, 11 Aug 2017 16:00:06 +0000 (09:00 -0700)]
xfs: fix inobt inode allocation search optimization

commit c44245b3d5435f533ca8346ece65918f84c057f9 upstream.

When we try to allocate a free inode by searching the inobt, we try to
find the inode nearest the parent inode by searching chunks both left
and right of the chunk containing the parent. As an optimization, we
cache the leftmost and rightmost records that we previously searched; if
we do another allocation with the same parent inode, we'll pick up the
search where it last left off.

There's a bug in the case where we found a free inode to the left of the
parent's chunk: we need to update the cached left and right records, but
because we already reassigned the right record to point to the left, we
end up assigning the left record to both the cached left and right
records.

This isn't a correctness problem strictly, but it can result in the next
allocation rechecking chunks unnecessarily or allocating inodes further
away from the parent than it needs to. Fix it by swapping the record
pointer after we update the cached left and right records.

Fixes: bd169565993b ("xfs: speed up free inode search")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoIB/uverbs: Fix device cleanup
Yishai Hadas [Tue, 1 Aug 2017 06:41:36 +0000 (09:41 +0300)]
IB/uverbs: Fix device cleanup

commit efdd6f53b10aead0f5cf19a93dd3eb268ac0d991 upstream.

Uverbs device should be cleaned up only when there is no
potential usage of.

As part of ib_uverbs_remove_one which might be triggered upon reset flow
the device reference count is decreased as expected and leave the final
cleanup to the FDs that were opened.

Current code increases reference count upon opening a new command FD and
decreases it upon closing the file. The event FD is opened internally
and rely on the command FD by taking on it a reference count.

In case that the command FD was closed and just later the event FD we
may ensure that the device resources as of srcu are still alive as they
are still in use.

Fixing the above by moving the reference count decreasing to the place
where the command FD is really freed instead of doing that when it was
just closed.

fixes: 036b10635739 ("IB/uverbs: Enable device removal when there are active user space applications")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoRDMA/uverbs: Prevent leak of reserved field
Leon Romanovsky [Tue, 1 Aug 2017 06:41:35 +0000 (09:41 +0300)]
RDMA/uverbs: Prevent leak of reserved field

commit f7a6cb7b38c6845b26aaa8bbdf519ff6e3090831 upstream.

initialize to zero the response structure to prevent
the leakage of "resp.reserved" field.

drivers/infiniband/core/uverbs_cmd.c:1178 ib_uverbs_resize_cq() warn:
check that 'resp.reserved' doesn't leak information

Fixes: 33b9b3ee9709 ("IB: Add userspace support for resizing CQs")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoocfs2: don't clear SGID when inheriting ACLs
Jan Kara [Wed, 2 Aug 2017 20:32:30 +0000 (13:32 -0700)]
ocfs2: don't clear SGID when inheriting ACLs

commit 19ec8e48582670c021e998b9deb88e39a842ff45 upstream.

When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0').  However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.

Fix the problem by moving posix_acl_update_mode() out of ocfs2_set_acl()
into ocfs2_iop_set_acl().  That way the function will not be called when
inheriting ACLs which is what we want as it prevents SGID bit clearing
and the mode has been properly set by posix_acl_create() anyway.  Also
posix_acl_chmod() that is calling ocfs2_set_acl() takes care of updating
mode itself.

Fixes: 073931017b4 ("posix_acl: Clear SGID bit when setting file permissions")
Link: http://lkml.kernel.org/r/20170801141252.19675-3-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: Move the call to posix_acl_update_mode() into
 ocfs2_xattr_set_acl(). Pass NULL as the bh argument to
 ocfs2_acl_set_mode(). Reuse the existing cleanup label.]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agonet/mlx4_en: Fix wrong indication of Wake-on-LAN (WoL) support
Inbar Karmy [Tue, 1 Aug 2017 13:43:43 +0000 (16:43 +0300)]
net/mlx4_en: Fix wrong indication of Wake-on-LAN (WoL) support

commit c994f778bb1cca8ebe7a4e528cefec233e93b5cc upstream.

Currently when WoL is supported but disabled, ethtool reports:
"Supports Wake-on: d".
Fix the indication of Wol support, so that the indication
remains "g" all the time if the NIC supports WoL.

Tested:
As accepted, when NIC supports WoL- ethtool reports:
Supports Wake-on: g
Wake-on: d
when NIC doesn't support WoL- ethtool reports:
        Supports Wake-on: d
        Wake-on: d

Fixes: 14c07b1358ed ("mlx4: Wake on LAN support")
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agogpio: tegra: fix unbalanced chained_irq_enter/exit
Michał Mirosław [Tue, 18 Jul 2017 12:35:45 +0000 (14:35 +0200)]
gpio: tegra: fix unbalanced chained_irq_enter/exit

commit 9e9509e38fbe034782339eb09c915f0b5765ff69 upstream.

When more than one GPIO IRQs are triggered simultaneously,
tegra_gpio_irq_handler() called chained_irq_exit() multiple
times for one chained_irq_enter().

Fixes: 3c92db9ac0ca3eee8e46e2424b6c074e2e394ad9
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
[Also changed the variable to a bool]
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoxtensa: mm/cache: add missing EXPORT_SYMBOLs
Max Filippov [Tue, 1 Aug 2017 18:15:15 +0000 (11:15 -0700)]
xtensa: mm/cache: add missing EXPORT_SYMBOLs

commit bc652eb6a0d5cffaea7dc8e8ad488aab2a1bf1ed upstream.

Functions clear_user_highpage, copy_user_highpage, flush_dcache_page,
local_flush_cache_range and local_flush_cache_page may be used from
modules. Export them.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
[bwh: Backported to 3.2: drop exports of {clear,copy}_user_highpage()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoxtensa: don't limit csum_partial export by CONFIG_NET
Max Filippov [Tue, 1 Aug 2017 18:02:46 +0000 (11:02 -0700)]
xtensa: don't limit csum_partial export by CONFIG_NET

commit 7f81e55c737a8fa82c71f290945d729a4902f8d2 upstream.

csum_partial and csum_partial_copy_generic are defined unconditionally
and are available even when CONFIG_NET is disabled. They are used not
only by the network drivers, but also by scsi and media.
Don't limit these functions export by CONFIG_NET.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoxtensa: add missing symbol exports
Max Filippov [Mon, 17 Sep 2012 01:44:56 +0000 (05:44 +0400)]
xtensa: add missing symbol exports

commit d3738f407c8ced4fd17dccf6cce729023c735c73 upstream.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Chris Zankel <chris@zankel.net>
[bwh: Backported to 3.2: drop exports of some functions that aren't defined here]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: hcd: Mark secondary HCD as dead if the primary one died
Rafael J. Wysocki [Tue, 25 Jul 2017 21:58:50 +0000 (23:58 +0200)]
USB: hcd: Mark secondary HCD as dead if the primary one died

commit cd5a6a4fdaba150089af2afc220eae0fef74878a upstream.

Make usb_hc_died() clear the HCD_FLAG_RH_RUNNING flag for the shared
HCD and set HCD_FLAG_DEAD for it, in analogy with what is done for
the primary one.

Among other thigs, this prevents check_root_hub_suspended() from
returning -EBUSY for dead HCDs which helps to work around system
suspend issues in some situations.

This actually fixes occasional suspend failures on one of my test
machines.

Suggested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoxtensa: fix cache aliasing handling code for WT cache
Max Filippov [Sat, 29 Jul 2017 00:42:59 +0000 (17:42 -0700)]
xtensa: fix cache aliasing handling code for WT cache

commit 6d0f581d1768d3eaba15776e7dd1fdfec10cfe36 upstream.

Currently building kernel for xtensa core with aliasing WT cache fails
with the following messages:

  mm/memory.c:2152: undefined reference to `flush_dcache_page'
  mm/memory.c:2332: undefined reference to `local_flush_cache_page'
  mm/memory.c:1919: undefined reference to `local_flush_cache_range'
  mm/memory.c:4179: undefined reference to `copy_to_user_page'
  mm/memory.c:4183: undefined reference to `copy_from_user_page'

This happens because implementation of these functions is only compiled
when data cache is WB, which looks wrong: even when data cache doesn't
need flushing it still needs invalidation. The functions like
__flush_[invalidate_]dcache_* are correctly defined for both WB and WT
caches (and even if they weren't that'd still be ok, just slower).

Fix this by providing the same implementation of the above functions for
both WB and WT cache.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoARM: pxa: select both FB and FB_W100 for eseries
Arnd Bergmann [Wed, 19 Mar 2014 17:41:37 +0000 (18:41 +0100)]
ARM: pxa: select both FB and FB_W100 for eseries

commit 1d20d8a9fce8f1e2ef00a0f3d068fa18d59ddf8f upstream.

We get a link error trying to access the w100fb_gpio_read/write
functions from the platform when the driver is a loadable module
or not built-in, so the platform already uses 'select' to hard-enable
the driver.

However, that fails if the framebuffer subsystem is disabled
altogether.

I've considered various ways to fix this properly, but they
all seem like too much work or too risky, so this simply
adds another 'select' to force the subsystem on as well.

Fixes: 82427de2c7c3 ("ARM: pxa: PXA_ESERIES depends on FB_W100.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agosctp: fix the check for _sctp_walk_params and _sctp_walk_errors
Xin Long [Wed, 26 Jul 2017 08:24:59 +0000 (16:24 +0800)]
sctp: fix the check for _sctp_walk_params and _sctp_walk_errors

commit 6b84202c946cd3da3a8daa92c682510e9ed80321 upstream.

Commit b1f5bfc27a19 ("sctp: don't dereference ptr before leaving
_sctp_walk_{params, errors}()") tried to fix the issue that it
may overstep the chunk end for _sctp_walk_{params, errors} with
'chunk_end > offset(length) + sizeof(length)'.

But it introduced a side effect: When processing INIT, it verifies
the chunks with 'param.v == chunk_end' after iterating all params
by sctp_walk_params(). With the check 'chunk_end > offset(length)
+ sizeof(length)', it would return when the last param is not yet
accessed. Because the last param usually is fwdtsn supported param
whose size is 4 and 'chunk_end == offset(length) + sizeof(length)'

This is a badly issue even causing sctp couldn't process 4-shakes.
Client would always get abort when connecting to server, due to
the failure of INIT chunk verification on server.

The patch is to use 'chunk_end <= offset(length) + sizeof(length)'
instead of 'chunk_end < offset(length) + sizeof(length)' for both
_sctp_walk_params and _sctp_walk_errors.

Fixes: b1f5bfc27a19 ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agosctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()
Alexander Potapenko [Fri, 14 Jul 2017 16:32:45 +0000 (18:32 +0200)]
sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()

commit b1f5bfc27a19f214006b9b4db7b9126df2dfdf5a upstream.

If the length field of the iterator (|pos.p| or |err|) is past the end
of the chunk, we shouldn't access it.

This bug has been detected by KMSAN. For the following pair of system
calls:

  socket(PF_INET6, SOCK_STREAM, 0x84 /* IPPROTO_??? */) = 3
  sendto(3, "A", 1, MSG_OOB, {sa_family=AF_INET6, sin6_port=htons(0),
         inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0,
         sin6_scope_id=0}, 28) = 1

the tool has reported a use of uninitialized memory:

  ==================================================================
  BUG: KMSAN: use of uninitialized memory in sctp_rcv+0x17b8/0x43b0
  CPU: 1 PID: 2940 Comm: probe Not tainted 4.11.0-rc5+ #2926
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
  01/01/2011
  Call Trace:
   <IRQ>
   __dump_stack lib/dump_stack.c:16
   dump_stack+0x172/0x1c0 lib/dump_stack.c:52
   kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927
   __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469
   __sctp_rcv_init_lookup net/sctp/input.c:1074
   __sctp_rcv_lookup_harder net/sctp/input.c:1233
   __sctp_rcv_lookup net/sctp/input.c:1255
   sctp_rcv+0x17b8/0x43b0 net/sctp/input.c:170
   sctp6_rcv+0x32/0x70 net/sctp/ipv6.c:984
   ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
   NF_HOOK ./include/linux/netfilter.h:257
   ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
   dst_input ./include/net/dst.h:492
   ip6_rcv_finish net/ipv6/ip6_input.c:69
   NF_HOOK ./include/linux/netfilter.h:257
   ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
   __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
   __netif_receive_skb net/core/dev.c:4246
   process_backlog+0x667/0xba0 net/core/dev.c:4866
   napi_poll net/core/dev.c:5268
   net_rx_action+0xc95/0x1590 net/core/dev.c:5333
   __do_softirq+0x485/0x942 kernel/softirq.c:284
   do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
   </IRQ>
   do_softirq kernel/softirq.c:328
   __local_bh_enable_ip+0x25b/0x290 kernel/softirq.c:181
   local_bh_enable+0x37/0x40 ./include/linux/bottom_half.h:31
   rcu_read_unlock_bh ./include/linux/rcupdate.h:931
   ip6_finish_output2+0x19b2/0x1cf0 net/ipv6/ip6_output.c:124
   ip6_finish_output+0x764/0x970 net/ipv6/ip6_output.c:149
   NF_HOOK_COND ./include/linux/netfilter.h:246
   ip6_output+0x456/0x520 net/ipv6/ip6_output.c:163
   dst_output ./include/net/dst.h:486
   NF_HOOK ./include/linux/netfilter.h:257
   ip6_xmit+0x1841/0x1c00 net/ipv6/ip6_output.c:261
   sctp_v6_xmit+0x3b7/0x470 net/sctp/ipv6.c:225
   sctp_packet_transmit+0x38cb/0x3a20 net/sctp/output.c:632
   sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885
   sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750
   sctp_side_effects net/sctp/sm_sideeffect.c:1773
   sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147
   sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88
   sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954
   inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
   sock_sendmsg_nosec net/socket.c:633
   sock_sendmsg net/socket.c:643
   SYSC_sendto+0x608/0x710 net/socket.c:1696
   SyS_sendto+0x8a/0xb0 net/socket.c:1664
   do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285
   entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246
  RIP: 0033:0x401133
  RSP: 002b:00007fff6d99cd38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
  RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000401133
  RDX: 0000000000000001 RSI: 0000000000494088 RDI: 0000000000000003
  RBP: 00007fff6d99cd90 R08: 00007fff6d99cd50 R09: 000000000000001c
  R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
  R13: 00000000004063d0 R14: 0000000000406460 R15: 0000000000000000
  origin:
   save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
   kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
   kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198
   kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:211
   slab_alloc_node mm/slub.c:2743
   __kmalloc_node_track_caller+0x200/0x360 mm/slub.c:4351
   __kmalloc_reserve net/core/skbuff.c:138
   __alloc_skb+0x26b/0x840 net/core/skbuff.c:231
   alloc_skb ./include/linux/skbuff.h:933
   sctp_packet_transmit+0x31e/0x3a20 net/sctp/output.c:570
   sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885
   sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750
   sctp_side_effects net/sctp/sm_sideeffect.c:1773
   sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147
   sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88
   sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954
   inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
   sock_sendmsg_nosec net/socket.c:633
   sock_sendmsg net/socket.c:643
   SYSC_sendto+0x608/0x710 net/socket.c:1696
   SyS_sendto+0x8a/0xb0 net/socket.c:1664
   do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285
   return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246
  ==================================================================

Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoIB/ipoib: Remove double pointer assigning
Leon Romanovsky [Sat, 15 Jul 2017 13:26:55 +0000 (16:26 +0300)]
IB/ipoib: Remove double pointer assigning

commit 1b355094b308f3377c8f574ce86135ee159c6285 upstream.

There is no need to assign "p" pointer twice.

This patch fixes the following smatch warning:
drivers/infiniband/ulp/ipoib/ipoib_cm.c:517 ipoib_cm_rx_handler() warn:
missing break? reassigning 'p->id'

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoIB/ipoib: Prevent setting negative values to max_nonsrq_conn_qp
Alex Vesker [Thu, 13 Jul 2017 08:27:12 +0000 (11:27 +0300)]
IB/ipoib: Prevent setting negative values to max_nonsrq_conn_qp

commit 11f74b40359b19f760964e71d04882a6caf530cc upstream.

Don't allow negative values to max_nonsrq_conn_qp. There is no functional
impact on a negative value but it is logicically incorrect.

Fixes: 68e995a29572 ("IPoIB/cm: Add connected mode support for devices without SRQs")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoperf/core: Fix locking for children siblings group read
Jiri Olsa [Thu, 20 Jul 2017 14:14:55 +0000 (16:14 +0200)]
perf/core: Fix locking for children siblings group read

commit 2aeb1883547626d82c597cce2c99f0b9c62e2425 upstream.

We're missing ctx lock when iterating children siblings
within the perf_read path for group reading. Following
race and crash can happen:

User space doing read syscall on event group leader:

T1:
  perf_read
    lock event->ctx->mutex
    perf_read_group
      lock leader->child_mutex
      __perf_read_group_add(child)
        list_for_each_entry(sub, &leader->sibling_list, group_entry)

---->   sub might be invalid at this point, because it could
        get removed via perf_event_exit_task_context in T2

Child exiting and cleaning up its events:

T2:
  perf_event_exit_task_context
    lock ctx->mutex
    list_for_each_entry_safe(child_event, next, &child_ctx->event_list,...
      perf_event_exit_event(child)
        lock ctx->lock
        perf_group_detach(child)
        unlock ctx->lock

---->   child is removed from sibling_list without any sync
        with T1 path above

        ...
        free_event(child)

Before the child is removed from the leader's child_list,
(and thus is omitted from perf_read_group processing), we
need to ensure that perf_read_group touches child's
siblings under its ctx->lock.

Peter further notes:

| One additional note; this bug got exposed by commit:
|
|   ba5213ae6b88 ("perf/core: Correct event creation with PERF_FORMAT_GROUP")
|
| which made it possible to actually trigger this code-path.

Tested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: ba5213ae6b88 ("perf/core: Correct event creation with PERF_FORMAT_GROUP")
Link: http://lkml.kernel.org/r/20170720141455.2106-1-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoperf/core: Invert perf_read_group() loops
Peter Zijlstra [Fri, 4 Sep 2015 03:07:49 +0000 (20:07 -0700)]
perf/core: Invert perf_read_group() loops

commit fa8c269353d560b7c28119ad7617029f92e40b15 upstream.

In order to enable the use of perf_event_read(.group = true), we need
to invert the sibling-child loop nesting of perf_read_group().

Currently we iterate the child list for each sibling, this precludes
using group reads. Flip things around so we iterate each group for
each child.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Made the patch compile and things. ]
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1441336073-22750-7-git-send-email-sukadev@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2 as a dependency of commit 2aeb18835476 ("perf/core: Fix
 locking for children siblings group read"):
 - Keep the function name perf_event_read_group()
 - Keep using perf_event_read_value()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoipv4: initialize fib_trie prior to register_netdev_notifier call.
Mahesh Bandewar [Wed, 19 Jul 2017 22:41:33 +0000 (15:41 -0700)]
ipv4: initialize fib_trie prior to register_netdev_notifier call.

commit 8799a221f5944a7d74516ecf46d58c28ec1d1f75 upstream.

Net stack initialization currently initializes fib-trie after the
first call to netdevice_notifier() call. In fact fib_trie initialization
needs to happen before first rtnl_register(). It does not cause any problem
since there are no devices UP at this moment, but trying to bring 'lo'
UP at initialization would make this assumption wrong and exposes the issue.

Fixes following crash

 Call Trace:
  ? alternate_node_alloc+0x76/0xa0
  fib_table_insert+0x1b7/0x4b0
  fib_magic.isra.17+0xea/0x120
  fib_add_ifaddr+0x7b/0x190
  fib_netdev_event+0xc0/0x130
  register_netdevice_notifier+0x1c1/0x1d0
  ip_fib_init+0x72/0x85
  ip_rt_init+0x187/0x1e9
  ip_init+0xe/0x1a
  inet_init+0x171/0x26c
  ? ipv4_offload_init+0x66/0x66
  do_one_initcall+0x43/0x160
  kernel_init_freeable+0x191/0x219
  ? rest_init+0x80/0x80
  kernel_init+0xe/0x150
  ret_from_fork+0x22/0x30
 Code: f6 46 23 04 74 86 4c 89 f7 e8 ae 45 01 00 49 89 c7 4d 85 ff 0f 85 7b ff ff ff 31 db eb 08 4c 89 ff e8 16 47 01 00 48 8b 44 24 38 <45> 8b 6e 14 4d 63 76 74 48 89 04 24 0f 1f 44 00 00 48 83 c4 08
 RIP: kmem_cache_alloc+0xcf/0x1c0 RSP: ffff9b1500017c28
 CR2: 0000000000000014

Fixes: 7b1a74fdbb9e ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.")
Fixes: 7f9b80529b8a ("[IPV4]: fib hash|trie initialization")

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoRDMA/core: Initialize port_num in qp_attr
Ismail, Mustafa [Fri, 14 Jul 2017 14:41:31 +0000 (09:41 -0500)]
RDMA/core: Initialize port_num in qp_attr

commit a62ab66b13a0f9bcb17b7b761f6670941ed5cd62 upstream.

Initialize the port_num for iWARP in rdma_init_qp_attr.

Fixes: 5ecce4c9b17b("Check port number supplied by user verbs cmds")
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoRDMA/uverbs: Fix the check for port number
Ismail, Mustafa [Fri, 14 Jul 2017 14:41:30 +0000 (09:41 -0500)]
RDMA/uverbs: Fix the check for port number

commit 5a7a88f1b488e4ee49eb3d5b82612d4d9ffdf2c3 upstream.

The port number is only valid if IB_QP_PORT is set in the mask.
So only check port number if it is valid to prevent modify_qp from
failing due to an invalid port number.

Fixes: 5ecce4c9b17b("Check port number supplied by user verbs cmds")
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.2: command structure is cmd not cmd->base]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoIB/cxgb3: Fix error codes in iwch_alloc_mr()
Dan Carpenter [Thu, 13 Jul 2017 07:48:00 +0000 (10:48 +0300)]
IB/cxgb3: Fix error codes in iwch_alloc_mr()

commit 9064d6055c14f700aa13f7c72fd3e63d12bee643 upstream.

We accidentally don't set the error code on some error paths.  It means
return ERR_PTR(0) which is NULL and results in a NULL dereference in the
caller.

Fixes: 13a239330abd ("RDMA/cxgb3: Don't ignore insert_handle() failures")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
[bwh: Backported to 3.2: drop inapplicable hunk]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agocxgb4: Fix error codes in c4iw_create_cq()
Dan Carpenter [Thu, 13 Jul 2017 07:47:40 +0000 (10:47 +0300)]
cxgb4: Fix error codes in c4iw_create_cq()

commit 6ebedacbb44602d4dec3348dee5ec31dd9b09521 upstream.

If one of these kmalloc() calls fails then we return ERR_PTR(0) which is
NULL.  It results in a NULL dereference in the callers.

Fixes: cfdda9d76436 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agox86/acpi: Prevent out of bound access caused by broken ACPI tables
Seunghun Han [Tue, 18 Jul 2017 11:03:51 +0000 (20:03 +0900)]
x86/acpi: Prevent out of bound access caused by broken ACPI tables

commit dad5ab0db8deac535d03e3fe3d8f2892173fa6a4 upstream.

The bus_irq argument of mp_override_legacy_irq() is used as the index into
the isa_irq_to_gsi[] array. The bus_irq argument originates from
ACPI_MADT_TYPE_IO_APIC and ACPI_MADT_TYPE_INTERRUPT items in the ACPI
tables, but is nowhere sanity checked.

That allows broken or malicious ACPI tables to overwrite memory, which
might cause malfunction, panic or arbitrary code execution.

Add a sanity check and emit a warning when that triggers.

[ tglx: Added warning and rewrote changelog ]

Signed-off-by: Seunghun Han <kkamagui@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: security@kernel.org
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agomount: copy the port field into the cloned nfs_server structure.
Steve Dickson [Thu, 29 Jun 2017 15:48:26 +0000 (11:48 -0400)]
mount: copy the port field into the cloned nfs_server structure.

commit 89a6814d9b665b196aa3a102f96b6dc7e8cb669e upstream.

Doing this copy eliminates the "port=0" entry in
the /proc/mounts entries

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=69241

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agolibata: array underflow in ata_find_dev()
Dan Carpenter [Wed, 19 Jul 2017 10:06:41 +0000 (13:06 +0300)]
libata: array underflow in ata_find_dev()

commit 59a5e266c3f5c1567508888dd61a45b86daed0fa upstream.

My static checker complains that "devno" can be negative, meaning that
we read before the start of the loop.  I've looked at the code, and I
think the warning is right.  This come from /proc so it's root only or
it would be quite a quite a serious bug.  The call tree looks like this:

proc_scsi_write() <- gets id and channel from simple_strtoul()
-> scsi_add_single_device() <- calls shost->transportt->user_scan()
   -> ata_scsi_user_scan()
      -> ata_find_dev()

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agousb: renesas_usbhs: fix usbhsc_resume() for !USBHSF_RUNTIME_PWCTRL
Yoshihiro Shimoda [Wed, 19 Jul 2017 07:16:54 +0000 (16:16 +0900)]
usb: renesas_usbhs: fix usbhsc_resume() for !USBHSF_RUNTIME_PWCTRL

commit 59a0879a0e17b2e43ecdc5e3299da85b8410d7ce upstream.

This patch fixes an issue that some registers may be not initialized
after resume if the USBHSF_RUNTIME_PWCTRL is not set. Otherwise,
if a cable is not connected, the driver will not enable INTENB0.VBSE
after resume. And then, the driver cannot detect the VBUS.

Fixes: ca8a282a5373 ("usb: gadget: renesas_usbhs: add suspend/resume support")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agousb: renesas_usbhs: fixup resume method for autonomy mode
Kuninori Morimoto [Mon, 6 Aug 2012 05:44:43 +0000 (22:44 -0700)]
usb: renesas_usbhs: fixup resume method for autonomy mode

commit 5b50d3b52601651ef3183cfb33d03cf486180e48 upstream.

If renesas_usbhs is probed as autonomy mode,
phy reset should be called after power resumed,
and manual cold-plug should be called with slight delay.

Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agousb: storage: return on error to avoid a null pointer dereference
Colin Ian King [Thu, 6 Jul 2017 15:06:32 +0000 (16:06 +0100)]
usb: storage: return on error to avoid a null pointer dereference

commit 446230f52a5bef593554510302465eabab45a372 upstream.

When us->extra is null the driver is not initialized, however, a
later call to osd200_scsi_to_ata is made that dereferences
us->extra, causing a null pointer dereference.  The code
currently detects and reports that the driver is not initialized;
add a return to avoid the subsequent dereference issue in this
check.

Thanks to Alan Stern for pointing out that srb->result needs setting
to DID_ERROR << 16

Detected by CoverityScan, CID#100308 ("Dereference after null check")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: cdc-acm: add device-id for quirky printer
Johan Hovold [Wed, 12 Jul 2017 13:08:39 +0000 (15:08 +0200)]
USB: cdc-acm: add device-id for quirky printer

commit fe855789d605590e57f9cd968d85ecce46f5c3fd upstream.

Add device-id entry for DATECS FP-2000 fiscal printer needing the
NO_UNION_NORMAL quirk.

Reported-by: Anton Avramov <lukav@lukav.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoUSB: serial: cp210x: add support for Qivicon USB ZigBee dongle
Stefan Triller [Fri, 30 Jun 2017 12:44:03 +0000 (14:44 +0200)]
USB: serial: cp210x: add support for Qivicon USB ZigBee dongle

commit 9585e340db9f6cc1c0928d82c3a23cc4460f0a3f upstream.

The German Telekom offers a ZigBee USB Stick under the brand name Qivicon
for their SmartHome Home Base in its 1. Generation. The productId is not
known by the according kernel module, this patch adds support for it.

Signed-off-by: Stefan Triller <github@stefantriller.de>
Reviewed-by: Frans Klaver <fransklaver@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agostaging:iio:resolver:ad2s1210 fix negative IIO_ANGL_VEL read
Arnd Bergmann [Fri, 14 Jul 2017 09:31:03 +0000 (11:31 +0200)]
staging:iio:resolver:ad2s1210 fix negative IIO_ANGL_VEL read

commit 105967ad68d2eb1a041bc041f9cf96af2a653b65 upstream.

gcc-7 points out an older regression:

drivers/staging/iio/resolver/ad2s1210.c: In function 'ad2s1210_read_raw':
drivers/staging/iio/resolver/ad2s1210.c:515:42: error: '<<' in boolean context, did you mean '<' ? [-Werror=int-in-bool-context]

The original code had 'unsigned short' here, but incorrectly got
converted to 'bool'. This reverts the regression and uses a normal
type instead.

Fixes: 29148543c521 ("staging:iio:resolver:ad2s1210 minimal chan spec conversion.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agoiio: light: tsl2563: use correct event code
Akinobu Mita [Tue, 20 Jun 2017 16:46:37 +0000 (01:46 +0900)]
iio: light: tsl2563: use correct event code

commit a3507e48d3f99a93a3056a34a5365f310434570f upstream.

The TSL2563 driver provides three iio channels, two of which are raw ADC
channels (channel 0 and channel 1) in the device and the remaining one
is calculated by the two.  The ADC channel 0 only supports programmable
interrupt with threshold settings and this driver supports the event but
the generated event code does not contain the corresponding iio channel
type.

This is going to change userspace ABI.  Hopefully fixing this to be
what it should always have been won't break any userspace code.

Cc: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
7 years agofuse: initialize the flock flag in fuse_file on allocation
Mateusz Jurczyk [Wed, 7 Jun 2017 10:26:49 +0000 (12:26 +0200)]
fuse: initialize the flock flag in fuse_file on allocation

commit 68227c03cba84a24faf8a7277d2b1a03c8959c2c upstream.

Before the patch, the flock flag could remain uninitialized for the
lifespan of the fuse_file allocation. Unless set to true in
fuse_file_flock(), it would remain in an indeterminate state until read in
an if statement in fuse_release_common(). This could consequently lead to
taking an unexpected branch in the code.

The bug was discovered by a runtime instrumentation designed to detect use
of uninitialized memory in the kernel.

Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Fixes: 37fb3a30b462 ("fuse: fix flock")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoLinux 3.2.94
Ben Hutchings [Thu, 12 Oct 2017 14:27:23 +0000 (15:27 +0100)]
Linux 3.2.94

8 years agocpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags
Zefan Li [Thu, 25 Sep 2014 01:41:02 +0000 (09:41 +0800)]
cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags

commit 2ad654bc5e2b211e92f66da1d819e47d79a866f0 upstream.

When we change cpuset.memory_spread_{page,slab}, cpuset will flip
PF_SPREAD_{PAGE,SLAB} bit of tsk->flags for each task in that cpuset.
This should be done using atomic bitops, but currently we don't,
which is broken.

Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happened
when one thread tried to clear PF_USED_MATH while at the same time another
thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on
the same task.

Here's the full report:
https://lkml.org/lkml/2014/9/19/230

To fix this, we make PF_SPREAD_PAGE and PF_SPREAD_SLAB atomic flags.

v4:
- updated mm/slab.c. (Fengguang Wu)
- updated Documentation.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Kees Cook <keescook@chromium.org>
Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time")
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
[lizf: Backported to 3.4:
 - adjust context
 - check current->flags & PF_MEMPOLICY rather than current->mempolicy]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosched: add macros to define bitops for task atomic flags
Zefan Li [Thu, 25 Sep 2014 01:40:40 +0000 (09:40 +0800)]
sched: add macros to define bitops for task atomic flags

commit e0e5070b20e01f0321f97db4e4e174f3f6b49e50 upstream.

This will simplify code when we add new flags.

v3:
- Kees pointed out that no_new_privs should never be cleared, so we
shouldn't define task_clear_no_new_privs(). we define 3 macros instead
of a single one.

v2:
- updated scripts/tags.sh, suggested by Peter

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
[lizf: Backported to 3.4:
 - adjust context
 - remove no_new_priv code
 - add atomic_flags to struct task_struct]
[bwh: Backported to 3.2:
 - Drop changes in scripts/tags.sh
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonet sched filters: fix notification of filter delete with proper handle
Jamal Hadi Salim [Tue, 25 Oct 2016 00:18:27 +0000 (20:18 -0400)]
net sched filters: fix notification of filter delete with proper handle

[ Upstream commit 9ee7837449b3d6f0fcf9132c6b5e5aaa58cc67d4 ]

Daniel says:

While trying out [1][2], I noticed that tc monitor doesn't show the
correct handle on delete:

$ tc monitor
qdisc clsact ffff: dev eno1 parent ffff:fff1
filter dev eno1 ingress protocol all pref 49152 bpf handle 0x2a [...]
deleted filter dev eno1 ingress protocol all pref 49152 bpf handle 0xf3be0c80

some context to explain the above:
The user identity of any tc filter is represented by a 32-bit
identifier encoded in tcm->tcm_handle. Example 0x2a in the bpf filter
above. A user wishing to delete, get or even modify a specific filter
uses this handle to reference it.
Every classifier is free to provide its own semantics for the 32 bit handle.
Example: classifiers like u32 use schemes like 800:1:801 to describe
the semantics of their filters represented as hash table, bucket and
node ids etc.
Classifiers also have internal per-filter representation which is different
from this externally visible identity. Most classifiers set this
internal representation to be a pointer address (which allows fast retrieval
of said filters in their implementations). This internal representation
is referenced with the "fh" variable in the kernel control code.

When a user successfuly deletes a specific filter, by specifying the correct
tcm->tcm_handle, an event is generated to user space which indicates
which specific filter was deleted.

Before this patch, the "fh" value was sent to user space as the identity.
As an example what is shown in the sample bpf filter delete event above
is 0xf3be0c80. This is infact a 32-bit truncation of 0xffff8807f3be0c80
which happens to be a 64-bit memory address of the internal filter
representation (address of the corresponding filter's struct cls_bpf_prog);

After this patch the appropriate user identifiable handle as encoded
in the originating request tcm->tcm_handle is generated in the event.
One of the cardinal rules of netlink rules is to be able to take an
event (such as a delete in this case) and reflect it back to the
kernel and successfully delete the filter. This patch achieves that.

Note, this issue has existed since the original TC action
infrastructure code patch back in 2004 as found in:
https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/

[1] http://patchwork.ozlabs.org/patch/682828/
[2] http://patchwork.ozlabs.org/patch/682829/

Fixes: 4e54c4816bfe ("[NET]: Add tc extensions infrastructure.")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agom32r: add io*_rep helpers
Sudip Mukherjee [Tue, 29 Dec 2015 22:54:19 +0000 (14:54 -0800)]
m32r: add io*_rep helpers

commit 92a8ed4c7643809123ef0a65424569eaacc5c6b0 upstream.

m32r allmodconfig was failing with the error:

  error: implicit declaration of function 'read'

On checking io.h it turned out that 'read' is not defined but 'readb' is
defined and 'ioread8' will then obviously mean 'readb'.

At the same time some of the helper functions ioreadN_rep() and
iowriteN_rep() were missing which also led to the build failure.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agom32r: add definition of ioremap_wc to io.h
Abhilash Kesavan [Fri, 6 Feb 2015 13:45:26 +0000 (19:15 +0530)]
m32r: add definition of ioremap_wc to io.h

commit 71a49d16f06de2ccdf52ca247d496a2bb1ca36fe upstream.

Before adding a resource managed ioremap_wc function, we need
to have ioremap_wc defined for m32r to prevent build errors.

Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoperf/x86: Check if user fp is valid
Arun Sharma [Fri, 20 Apr 2012 22:41:35 +0000 (15:41 -0700)]
perf/x86: Check if user fp is valid

commit bc6ca7b342d5ae15c3ba3081fd40271b8039fb25 upstream.

Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1334961696-19580-4-git-send-email-asharma@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2: also add user_addr_max() macro]
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoMIPS: Refactor 'clear_page' and 'copy_page' functions.
Steven J. Hill [Fri, 6 Jul 2012 19:56:01 +0000 (21:56 +0200)]
MIPS: Refactor 'clear_page' and 'copy_page' functions.

commit c022630633624a75b3b58f43dd3c6cc896a56cff upstream.

Remove usage of the '__attribute__((alias("...")))' hack that aliased
to integer arrays containing micro-assembled instructions. This hack
breaks when building a microMIPS kernel. It also makes the code much
easier to understand.

[ralf@linux-mips.org: Added back export of the clear_page and copy_page
symbols so certain modules will work again.  Also fixed build with
CONFIG_SIBYTE_DMA_PAGEOPS enabled.]

Signed-off-by: Steven J. Hill <sjhill@mips.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/3866/
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[bwh: Backported to 3.2: adjust context]
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonetfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get
Andrey Vagin [Wed, 29 Jan 2014 18:34:14 +0000 (19:34 +0100)]
netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.

Lets look at destroy_conntrack:

hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
...
nf_conntrack_free(ct)
kmem_cache_free(net->ct.nf_conntrack_cachep, ct);

net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU.

The hash is protected by rcu, so readers look up conntracks without
locks.
A conntrack is removed from the hash, but in this moment a few readers
still can use the conntrack. Then this conntrack is released and another
thread creates conntrack with the same address and the equal tuple.
After this a reader starts to validate the conntrack:
* It's not dying, because a new conntrack was created
* nf_ct_tuple_equal() returns true.

But this conntrack is not initialized yet, so it can not be used by two
threads concurrently. In this case BUG_ON may be triggered from
nf_nat_setup_info().

Florian Westphal suggested to check the confirm bit too. I think it's
right.

task 1 task 2 task 3
nf_conntrack_find_get
 ____nf_conntrack_find
destroy_conntrack
 hlist_nulls_del_rcu
 nf_conntrack_free
 kmem_cache_free
__nf_conntrack_alloc
 kmem_cache_alloc
 memset(&ct->tuplehash[IP_CT_DIR_MAX],
 if (nf_ct_is_dying(ct))
 if (!nf_ct_tuple_equal()

I'm not sure, that I have ever seen this race condition in a real life.
Currently we are investigating a bug, which is reproduced on a few nodes.
In our case one conntrack is initialized from a few tasks concurrently,
we don't have any other explanation for this.

<2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322!
...
<4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>]  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat]
...
<4>[46267.085549] Call Trace:
<4>[46267.085622]  [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat]
<4>[46267.085697]  [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat]
<4>[46267.085770]  [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat]
<4>[46267.085843]  [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat]
<4>[46267.085919]  [<ffffffff814841b9>] nf_iterate+0x69/0xb0
<4>[46267.085991]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086063]  [<ffffffff81484374>] nf_hook_slow+0x74/0x110
<4>[46267.086133]  [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0
<4>[46267.086207]  [<ffffffff814b5890>] ? dst_output+0x0/0x20
<4>[46267.086277]  [<ffffffff81495204>] ip_output+0xa4/0xc0
<4>[46267.086346]  [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910
<4>[46267.086419]  [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0
<4>[46267.086491]  [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50
<4>[46267.086562]  [<ffffffff81444d67>] sock_sendmsg+0x117/0x140
<4>[46267.086638]  [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20
<4>[46267.086712]  [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40
<4>[46267.086785]  [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80
<4>[46267.086858]  [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20
<4>[46267.086936]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087006]  [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90
<4>[46267.087081]  [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0
<4>[46267.087151]  [<ffffffff81445599>] sys_sendto+0x139/0x190
<4>[46267.087229]  [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0
<4>[46267.087303]  [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200
<4>[46267.087378]  [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290
<4>[46267.087454]  [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210
<4>[46267.087531]  [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210
<4>[46267.087607]  [<ffffffff8104dea3>] ia32_sysret+0x0/0x5
<4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74
<1>[46267.088023] RIP  [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agol2tp: avoid use-after-free caused by l2tp_ip_backlog_recv
Paul Hüber [Sun, 26 Feb 2017 16:58:19 +0000 (17:58 +0100)]
l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv

commit 51fb60eb162ab84c5edf2ae9c63cf0b878e5547e upstream.

l2tp_ip_backlog_recv may not return -1 if the packet gets dropped.
The return value is passed up to ip_local_deliver_finish, which treats
negative values as an IP protocol number for resubmission.

Signed-off-by: Paul Hüber <phueber@kernsp.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoBluetooth: Properly check L2CAP config option output buffer length
Ben Seri [Sat, 9 Sep 2017 21:15:59 +0000 (23:15 +0200)]
Bluetooth: Properly check L2CAP config option output buffer length

commit e860d2c904d1a9f38a24eb44c9f34b8f915a6ea3 upstream.

Validate the output buffer length for L2CAP config requests and responses
to avoid overflowing the stack buffer used for building the option blocks.

Signed-off-by: Ben Seri <ben@armis.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2:
 - Drop changes to handling of L2CAP_CONF_EFS, L2CAP_CONF_EWS
 - Drop changes to l2cap_do_create(), l2cap_security_cfm(), and L2CAP_CONF_PENDING
   case in l2cap_config_rsp()
 - In l2cap_config_rsp(), s/buf/req/
 - Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoscsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly
Xin Long [Sun, 27 Aug 2017 12:25:26 +0000 (20:25 +0800)]
scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly

commit c88f0e6b06f4092995688211a631bb436125d77b upstream.

ChunYu found a kernel crash by syzkaller:

[  651.617875] kasan: CONFIG_KASAN_INLINE enabled
[  651.618217] kasan: GPF could be caused by NULL-ptr deref or user memory access
[  651.618731] general protection fault: 0000 [#1] SMP KASAN
[  651.621543] CPU: 1 PID: 9539 Comm: scsi Not tainted 4.11.0.cov #32
[  651.621938] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  651.622309] task: ffff880117780000 task.stack: ffff8800a3188000
[  651.622762] RIP: 0010:skb_release_data+0x26c/0x590
[...]
[  651.627260] Call Trace:
[  651.629156]  skb_release_all+0x4f/0x60
[  651.629450]  consume_skb+0x1a5/0x600
[  651.630705]  netlink_unicast+0x505/0x720
[  651.632345]  netlink_sendmsg+0xab2/0xe70
[  651.633704]  sock_sendmsg+0xcf/0x110
[  651.633942]  ___sys_sendmsg+0x833/0x980
[  651.637117]  __sys_sendmsg+0xf3/0x240
[  651.638820]  SyS_sendmsg+0x32/0x50
[  651.639048]  entry_SYSCALL_64_fastpath+0x1f/0xc2

It's caused by skb_shared_info at the end of sk_buff was overwritten by
ISCSI_KEVENT_IF_ERROR when parsing nlmsg info from skb in iscsi_if_rx.

During the loop if skb->len == nlh->nlmsg_len and both are sizeof(*nlh),
ev = nlmsg_data(nlh) will acutally get skb_shinfo(SKB) instead and set a
new value to skb_shinfo(SKB)->nr_frags by ev->type.

This patch is to fix it by checking nlh->nlmsg_len properly there to
avoid over accessing sk_buff.

Reported-by: ChunYu Wang <chunwang@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoxfs: XFS_IS_REALTIME_INODE() should be false if no rt device present
Richard Wareing [Tue, 12 Sep 2017 23:09:35 +0000 (09:09 +1000)]
xfs: XFS_IS_REALTIME_INODE() should be false if no rt device present

commit b31ff3cdf540110da4572e3e29bd172087af65cc upstream.

If using a kernel with CONFIG_XFS_RT=y and we set the RHINHERIT flag on
a directory in a filesystem that does not have a realtime device and
create a new file in that directory, it gets marked as a real time file.
When data is written and a fsync is issued, the filesystem attempts to
flush a non-existent rt device during the fsync process.

This results in a crash dereferencing a null buftarg pointer in
xfs_blkdev_issue_flush():

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  IP: xfs_blkdev_issue_flush+0xd/0x20
  .....
  Call Trace:
    xfs_file_fsync+0x188/0x1c0
    vfs_fsync_range+0x3b/0xa0
    do_fsync+0x3d/0x70
    SyS_fsync+0x10/0x20
    do_syscall_64+0x4d/0xb0
    entry_SYSCALL64_slow_path+0x25/0x25

Setting RT inode flags does not require special privileges so any
unprivileged user can cause this oops to occur.  To reproduce, confirm
kernel is compiled with CONFIG_XFS_RT=y and run:

  # mkfs.xfs -f /dev/pmem0
  # mount /dev/pmem0 /mnt/test
  # mkdir /mnt/test/foo
  # xfs_io -c 'chattr +t' /mnt/test/foo
  # xfs_io -f -c 'pwrite 0 5m' -c fsync /mnt/test/foo/bar

Or just run xfstests with MKFS_OPTIONS="-d rtinherit=1" and wait.

Kernels built with CONFIG_XFS_RT=n are not exposed to this bug.

Fixes: f538d4da8d52 ("[XFS] write barrier support")
Signed-off-by: Richard Wareing <rwareing@fb.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agovideo: fbdev: aty: do not leak uninitialized padding in clk to userspace
Vladis Dronov [Mon, 4 Sep 2017 14:00:50 +0000 (16:00 +0200)]
video: fbdev: aty: do not leak uninitialized padding in clk to userspace

commit 8e75f7a7a00461ef6d91797a60b606367f6e344d upstream.

'clk' is copied to a userland with padding byte(s) after 'vclk_post_div'
field unitialized, leaking data from the stack. Fix this ensuring all of
'clk' is initialized to zero.

References: https://github.com/torvalds/linux/pull/441
Reported-by: sohu0106 <sohu0106@126.com>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agokvm: nVMX: Don't allow L2 to access the hardware CR8
Jim Mattson [Tue, 12 Sep 2017 20:02:54 +0000 (13:02 -0700)]
kvm: nVMX: Don't allow L2 to access the hardware CR8

commit 51aa68e7d57e3217192d88ce90fd5b8ef29ec94f upstream.

If L1 does not specify the "use TPR shadow" VM-execution control in
vmcs12, then L0 must specify the "CR8-load exiting" and "CR8-store
exiting" VM-execution controls in vmcs02. Failure to do so will give
the L2 VM unrestricted read/write access to the hardware CR8.

This fixes CVE-2017-12154.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agonl80211: check for the required netlink attributes presence
Vladis Dronov [Tue, 12 Sep 2017 22:21:21 +0000 (00:21 +0200)]
nl80211: check for the required netlink attributes presence

commit e785fa0a164aa11001cba931367c7f94ffaff888 upstream.

nl80211_set_rekey_data() does not check if the required attributes
NL80211_REKEY_DATA_{REPLAY_CTR,KEK,KCK} are present when processing
NL80211_CMD_SET_REKEY_OFFLOAD request. This request can be issued by
users with CAP_NET_ADMIN privilege and may result in NULL dereference
and a system crash. Add a check for the required attributes presence.
This patch is based on the patch by bo Zhang.

This fixes CVE-2017-12153.

References: https://bugzilla.redhat.com/show_bug.cgi?id=1491046
Fixes: e5497d766ad ("cfg80211/nl80211: support GTK rekey offload")
Reported-by: bo Zhang <zhangbo5891001@gmail.com>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosaa7164: fix double fetch PCIe access condition
Steven Toth [Tue, 6 Jun 2017 12:30:27 +0000 (09:30 -0300)]
saa7164: fix double fetch PCIe access condition

commit 6fb05e0dd32e566facb96ea61a48c7488daa5ac3 upstream.

Avoid a double fetch by reusing the values from the prior transfer.

Originally reported via https://bugzilla.kernel.org/show_bug.cgi?id=195559

Thanks to Pengfei Wang <wpengfeinudt@gmail.com> for reporting.

Signed-off-by: Steven Toth <stoth@kernellabs.com>
Reported-by: Pengfei Wang <wpengfeinudt@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosaa7164: fix sparse warnings
Hans Verkuil [Fri, 7 Nov 2014 14:39:46 +0000 (11:39 -0300)]
saa7164: fix sparse warnings

commit 065e1477d277174242e73e7334c717b840d0693f upstream.

Fix many sparse warnings:

drivers/media/pci/saa7164/saa7164-core.c:97:18: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:122:31: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:122:31: warning: incorrect type in initializer (different address spaces)
drivers/media/pci/saa7164/saa7164-core.c:122:31:    expected unsigned char [noderef] [usertype] <asn:2>*bufcpu
drivers/media/pci/saa7164/saa7164-core.c:122:31:    got unsigned char [usertype] *<noident>
drivers/media/pci/saa7164/saa7164-core.c:282:44: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:286:38: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:286:35: warning: incorrect type in assignment (different address spaces)
drivers/media/pci/saa7164/saa7164-core.c:286:35:    expected unsigned char [noderef] [usertype] <asn:2>*p
drivers/media/pci/saa7164/saa7164-core.c:286:35:    got unsigned char [usertype] *<noident>
drivers/media/pci/saa7164/saa7164-core.c:352:44: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:527:53: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-core.c:129:30: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:133:38: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:133:72: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:134:35: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:287:61: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:288:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:289:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:290:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:291:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:292:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:293:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-core.c:294:65: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-fw.c:548:52: warning: incorrect type in argument 5 (different address spaces)
drivers/media/pci/saa7164/saa7164-fw.c:548:52:    expected unsigned char [usertype] *dst
drivers/media/pci/saa7164/saa7164-fw.c:548:52:    got unsigned char [noderef] [usertype] <asn:2>*
drivers/media/pci/saa7164/saa7164-fw.c:579:44: warning: incorrect type in argument 5 (different address spaces)
drivers/media/pci/saa7164/saa7164-fw.c:579:44:    expected unsigned char [usertype] *dst
drivers/media/pci/saa7164/saa7164-fw.c:579:44:    got unsigned char [noderef] [usertype] <asn:2>*
drivers/media/pci/saa7164/saa7164-fw.c:597:44: warning: incorrect type in argument 5 (different address spaces)
drivers/media/pci/saa7164/saa7164-fw.c:597:44:    expected unsigned char [usertype] *dst
drivers/media/pci/saa7164/saa7164-fw.c:597:44:    got unsigned char [noderef] [usertype] <asn:2>*
drivers/media/pci/saa7164/saa7164-bus.c:36:36: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-bus.c:41:36: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-bus.c:151:19: warning: incorrect type in assignment (different base types)
drivers/media/pci/saa7164/saa7164-bus.c:151:19:    expected unsigned short [unsigned] [usertype] size
drivers/media/pci/saa7164/saa7164-bus.c:151:19:    got restricted __le16 [usertype] <noident>
drivers/media/pci/saa7164/saa7164-bus.c:152:22: warning: incorrect type in assignment (different base types)
drivers/media/pci/saa7164/saa7164-bus.c:152:22:    expected unsigned int [unsigned] [usertype] command
drivers/media/pci/saa7164/saa7164-bus.c:152:22:    got restricted __le32 [usertype] <noident>
drivers/media/pci/saa7164/saa7164-bus.c:153:30: warning: incorrect type in assignment (different base types)
drivers/media/pci/saa7164/saa7164-bus.c:153:30:    expected unsigned short [unsigned] [usertype] controlselector
drivers/media/pci/saa7164/saa7164-bus.c:153:30:    got restricted __le16 [usertype] <noident>
drivers/media/pci/saa7164/saa7164-bus.c:172:20: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:173:20: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:206:28: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:287:9: warning: incorrect type in argument 1 (different base types)
drivers/media/pci/saa7164/saa7164-bus.c:287:9:    expected unsigned int [unsigned] val
drivers/media/pci/saa7164/saa7164-bus.c:287:9:    got restricted __le32 [usertype] <noident>
drivers/media/pci/saa7164/saa7164-bus.c:339:20: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:340:20: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:463:9: warning: incorrect type in argument 1 (different base types)
drivers/media/pci/saa7164/saa7164-bus.c:463:9:    expected unsigned int [unsigned] val
drivers/media/pci/saa7164/saa7164-bus.c:463:9:    got restricted __le32 [usertype] <noident>
drivers/media/pci/saa7164/saa7164-bus.c:466:21: warning: cast to restricted __le16
drivers/media/pci/saa7164/saa7164-bus.c:467:24: warning: cast to restricted __le32
drivers/media/pci/saa7164/saa7164-bus.c:468:32: warning: cast to restricted __le16
drivers/media/pci/saa7164/saa7164-buffer.c:122:18: warning: incorrect type in assignment (different address spaces)
drivers/media/pci/saa7164/saa7164-buffer.c:122:18:    expected unsigned long long [noderef] [usertype] <asn:2>*cpu
drivers/media/pci/saa7164/saa7164-buffer.c:122:18:    got void *
drivers/media/pci/saa7164/saa7164-buffer.c:127:21: warning: incorrect type in assignment (different address spaces)
drivers/media/pci/saa7164/saa7164-buffer.c:127:21:    expected unsigned long long [noderef] [usertype] <asn:2>*pt_cpu
drivers/media/pci/saa7164/saa7164-buffer.c:127:21:    got void *
drivers/media/pci/saa7164/saa7164-buffer.c:134:20: warning: cast removes address space of expression
drivers/media/pci/saa7164/saa7164-buffer.c:156:63: warning: incorrect type in argument 3 (different address spaces)
drivers/media/pci/saa7164/saa7164-buffer.c:156:63:    expected void *vaddr
drivers/media/pci/saa7164/saa7164-buffer.c:156:63:    got unsigned long long [noderef] [usertype] <asn:2>*cpu
drivers/media/pci/saa7164/saa7164-buffer.c:179:57: warning: incorrect type in argument 3 (different address spaces)
drivers/media/pci/saa7164/saa7164-buffer.c:179:57:    expected void *vaddr
drivers/media/pci/saa7164/saa7164-buffer.c:179:57:    got unsigned long long [noderef] [usertype] <asn:2>*cpu
drivers/media/pci/saa7164/saa7164-buffer.c:180:56: warning: incorrect type in argument 3 (different address spaces)
drivers/media/pci/saa7164/saa7164-buffer.c:180:56:    expected void *vaddr
drivers/media/pci/saa7164/saa7164-buffer.c:180:56:    got unsigned long long [noderef] [usertype] <asn:2>*pt_cpu
drivers/media/pci/saa7164/saa7164-buffer.c:84:17: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-buffer.c:147:31: warning: dereference of noderef expression
drivers/media/pci/saa7164/saa7164-buffer.c:148:17: warning: dereference of noderef expression

Most are caused by pointers marked as __iomem when they aren't or not marked as
__iomem when they should.

Also note that readl/writel already do endian conversion, so there is no need to
do it again.

saa7164_bus_set/get were a bit tricky: you have to make sure the msg endian
conversion is done at the right time, and that the code isn't using fields that
are still little endian instead of cpu-endianness.

The approach chosen is to convert just before writing to the ring buffer
and to convert it back right after reading from the ring buffer.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Steven Toth <stoth@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
[bwh: Backported to 3.2: adjust filenames, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agosaa7164: fix endian conversion in saa7164_bus_set()
Dan Carpenter [Mon, 28 Nov 2011 12:08:53 +0000 (09:08 -0300)]
saa7164: fix endian conversion in saa7164_bus_set()

commit 773ddbd228dc16a4829836e1dc16383e44c8575e upstream.

The msg->command field is 32 bits, and we should fill it with a call
to cpu_to_le32().  The current code is broke on big endian systems.
On little endian systems it truncates the 32 bit value to 16 bits
which probably still works fine.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Steven Toth <stoth@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agobtrfs: preserve i_mode if __btrfs_set_acl() fails
Ernesto A. Fernández [Wed, 2 Aug 2017 06:18:27 +0000 (03:18 -0300)]
btrfs: preserve i_mode if __btrfs_set_acl() fails

commit d7d824966530acfe32b94d1ed672e6fe1638cd68 upstream.

When changing a file's acl mask, btrfs_set_acl() will first set the
group bits of i_mode to the value of the mask, and only then set the
actual extended attribute representing the new acl.

If the second part fails (due to lack of space, for example) and the
file had no acl attribute to begin with, the system will from now on
assume that the mask permission bits are actual group permission bits,
potentially granting access to the wrong users.

Prevent this by restoring the original mode bits if __btrfs_set_acl
fails.

Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoext4: Don't clear SGID when inheriting ACLs
Jan Kara [Mon, 31 Jul 2017 03:33:01 +0000 (23:33 -0400)]
ext4: Don't clear SGID when inheriting ACLs

commit a3bb2d5587521eea6dab2d05326abb0afb460abd upstream.

When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.

Fix the problem by moving posix_acl_update_mode() out of
__ext4_set_acl() into ext4_set_acl(). That way the function will not be
called when inheriting ACLs which is what we want as it prevents SGID
bit clearing and the mode has been properly set by posix_acl_create()
anyway.

Fixes: 073931017b49d9458aa351605b43a7e34598caef
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
[bwh: Backported to 3.2: the __ext4_set_acl() function didn't exist,
 so added it]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoext4: preserve i_mode if __ext4_set_acl() fails
Ernesto A. Fernández [Mon, 31 Jul 2017 02:43:41 +0000 (22:43 -0400)]
ext4: preserve i_mode if __ext4_set_acl() fails

commit 397e434176bb62bc6068d2210af1d876c6212a7e upstream.

When changing a file's acl mask, __ext4_set_acl() will first set the group
bits of i_mode to the value of the mask, and only then set the actual
extended attribute representing the new acl.

If the second part fails (due to lack of space, for example) and the file
had no acl attribute to begin with, the system will from now on assume
that the mask permission bits are actual group permission bits, potentially
granting access to the wrong users.

Prevent this by only changing the inode mode after the acl has been set.

Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoreiserfs: preserve i_mode if __reiserfs_set_acl() fails
Ernesto A. Fernández [Mon, 17 Jul 2017 16:42:41 +0000 (18:42 +0200)]
reiserfs: preserve i_mode if __reiserfs_set_acl() fails

commit fcea8aed91f53b51f9b943dc01f12d8aa666c720 upstream.

When changing a file's acl mask, reiserfs_set_acl() will first set the
group bits of i_mode to the value of the mask, and only then set the
actual extended attribute representing the new acl.

If the second part fails (due to lack of space, for example) and the
file had no acl attribute to begin with, the system will from now on
assume that the mask permission bits are actual group permission bits,
potentially granting access to the wrong users.

Prevent this by only changing the inode mode after the acl has been set.

Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoreiserfs: Don't clear SGID when inheriting ACLs
Jan Kara [Thu, 22 Jun 2017 07:32:49 +0000 (09:32 +0200)]
reiserfs: Don't clear SGID when inheriting ACLs

commit 6883cd7f68245e43e91e5ee583b7550abf14523f upstream.

When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.

Fix the problem by moving posix_acl_update_mode() out of
__reiserfs_set_acl() into reiserfs_set_acl(). That way the function will
not be called when inheriting ACLs which is what we want as it prevents
SGID bit clearing and the mode has been properly set by
posix_acl_create() anyway.

Fixes: 073931017b49d9458aa351605b43a7e34598caef
CC: reiserfs-devel@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
[bwh: Backported to 3.2: the __reiserfs_set_acl() function didn't exist,
 so added it]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
8 years agoext3: preserve i_mode if ext2_set_acl() fails
Ben Hutchings [Sun, 8 Oct 2017 13:48:44 +0000 (14:48 +0100)]
ext3: preserve i_mode if ext2_set_acl() fails

Based on Ernesto A. Fernández's fix for ext2 (commit fe26569eb919), from
which the following description is taken:

> When changing a file's acl mask, ext2_set_acl() will first set the group
> bits of i_mode to the value of the mask, and only then set the actual
> extended attribute representing the new acl.
>
> If the second part fails (due to lack of space, for example) and the file
> had no acl attribute to begin with, the system will from now on assume
> that the mask permission bits are actual group permission bits, potentially
> granting access to the wrong users.
>
> Prevent this by only changing the inode mode after the acl has been set.

Cc: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>