From: Russell King Date: Thu, 22 Sep 2011 21:39:23 +0000 (+0100) Subject: Merge branch 'pm' into devel-stable X-Git-Tag: v3.2-rc1~130^2~5 X-Git-Url: http://git.openpandora.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=b0a37dca72a05b7b579f288d8a67afeed96bffa5;hp=8e6f83bbdf770014c070c5a41c8e89617cb2a66b;p=pandora-kernel.git Merge branch 'pm' into devel-stable --- diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX index 1f89424c36a6..65bbd2622396 100644 --- a/Documentation/00-INDEX +++ b/Documentation/00-INDEX @@ -272,6 +272,8 @@ printk-formats.txt - how to get printk format specifiers right prio_tree.txt - info on radix-priority-search-tree use for indexing vmas. +ramoops.txt + - documentation of the ramoops oops/panic logging module. rbtree.txt - info on what red-black trees are and what they are for. robust-futex-ABI.txt diff --git a/Documentation/DocBook/media/v4l/controls.xml b/Documentation/DocBook/media/v4l/controls.xml index 85164016ed26..23fdf79f8cf3 100644 --- a/Documentation/DocBook/media/v4l/controls.xml +++ b/Documentation/DocBook/media/v4l/controls.xml @@ -1455,7 +1455,7 @@ Applicable to the H264 encoder. - + V4L2_CID_MPEG_VIDEO_H264_VUI_SAR_IDC  enum v4l2_mpeg_video_h264_vui_sar_idc @@ -1561,7 +1561,7 @@ Applicable to the H264 encoder. - + V4L2_CID_MPEG_VIDEO_H264_LEVEL  enum v4l2_mpeg_video_h264_level @@ -1641,7 +1641,7 @@ Possible values are: - + V4L2_CID_MPEG_VIDEO_MPEG4_LEVEL  enum v4l2_mpeg_video_mpeg4_level @@ -1689,9 +1689,9 @@ Possible values are: - + V4L2_CID_MPEG_VIDEO_H264_PROFILE  - enum v4l2_mpeg_h264_profile + enum v4l2_mpeg_video_h264_profile The profile information for H264. Applicable to the H264 encoder. @@ -1774,9 +1774,9 @@ Possible values are: - + V4L2_CID_MPEG_VIDEO_MPEG4_PROFILE  - enum v4l2_mpeg_mpeg4_profile + enum v4l2_mpeg_video_mpeg4_profile The profile information for MPEG4. Applicable to the MPEG4 encoder. @@ -1820,9 +1820,9 @@ Applicable to the encoder. - + V4L2_CID_MPEG_VIDEO_MULTI_SLICE_MODE  - enum v4l2_mpeg_multi_slice_mode + enum v4l2_mpeg_video_multi_slice_mode Determines how the encoder should handle division of frame into slices. Applicable to the encoder. @@ -1868,9 +1868,9 @@ Applicable to the encoder. - + V4L2_CID_MPEG_VIDEO_H264_LOOP_FILTER_MODE  - enum v4l2_mpeg_h264_loop_filter_mode + enum v4l2_mpeg_video_h264_loop_filter_mode Loop filter mode for H264 encoder. Possible values are: @@ -1913,9 +1913,9 @@ Applicable to the H264 encoder. - + V4L2_CID_MPEG_VIDEO_H264_ENTROPY_MODE  - enum v4l2_mpeg_h264_symbol_mode + enum v4l2_mpeg_video_h264_entropy_mode Entropy coding mode for H264 - CABAC/CAVALC. Applicable to the H264 encoder. @@ -2140,9 +2140,9 @@ previous frames. Applicable to the H264 encoder. - + V4L2_CID_MPEG_VIDEO_HEADER_MODE  - enum v4l2_mpeg_header_mode + enum v4l2_mpeg_video_header_mode Determines whether the header is returned as the first buffer or is it returned together with the first frame. Applicable to encoders. @@ -2320,9 +2320,9 @@ Valid only when H.264 and macroblock level RC is enabled (V4L2_CID_MPE Applicable to the H264 encoder. - + V4L2_CID_MPEG_MFC51_VIDEO_FRAME_SKIP_MODE  - enum v4l2_mpeg_mfc51_frame_skip_mode + enum v4l2_mpeg_mfc51_video_frame_skip_mode Indicates in what conditions the encoder should skip frames. If encoding a frame would cause the encoded stream to be larger then @@ -2361,9 +2361,9 @@ the stream will meet tight bandwidth contraints. Applicable to encoders. 
- + V4L2_CID_MPEG_MFC51_VIDEO_FORCE_FRAME_TYPE  - enum v4l2_mpeg_mfc51_force_frame_type + enum v4l2_mpeg_mfc51_video_force_frame_type Force a frame type for the next queued buffer. Applicable to encoders. Possible values are: diff --git a/Documentation/PCI/MSI-HOWTO.txt b/Documentation/PCI/MSI-HOWTO.txt index 3f5e0b09bed5..53e6fca146d7 100644 --- a/Documentation/PCI/MSI-HOWTO.txt +++ b/Documentation/PCI/MSI-HOWTO.txt @@ -45,7 +45,7 @@ arrived in memory (this becomes more likely with devices behind PCI-PCI bridges). In order to ensure that all the data has arrived in memory, the interrupt handler must read a register on the device which raised the interrupt. PCI transaction ordering rules require that all the data -arrives in memory before the value can be returned from the register. +arrive in memory before the value may be returned from the register. Using MSIs avoids this problem as the interrupt-generating write cannot pass the data writes, so by the time the interrupt is raised, the driver knows that all the data has arrived in memory. @@ -86,13 +86,13 @@ device. int pci_enable_msi(struct pci_dev *dev) -A successful call will allocate ONE interrupt to the device, regardless -of how many MSIs the device supports. The device will be switched from +A successful call allocates ONE interrupt to the device, regardless +of how many MSIs the device supports. The device is switched from pin-based interrupt mode to MSI mode. The dev->irq number is changed -to a new number which represents the message signaled interrupt. -This function should be called before the driver calls request_irq() -since enabling MSIs disables the pin-based IRQ and the driver will not -receive interrupts on the old interrupt. +to a new number which represents the message signaled interrupt; +consequently, this function should be called before the driver calls +request_irq(), because an MSI is delivered via a vector that is +different from the vector of a pin-based interrupt. 4.2.2 pci_enable_msi_block @@ -111,20 +111,20 @@ the device are in the range dev->irq to dev->irq + count - 1. If this function returns a negative number, it indicates an error and the driver should not attempt to request any more MSI interrupts for -this device. If this function returns a positive number, it will be -less than 'count' and indicate the number of interrupts that could have -been allocated. In neither case will the irq value have been -updated, nor will the device have been switched into MSI mode. +this device. If this function returns a positive number, it is +less than 'count' and indicates the number of interrupts that could have +been allocated. In neither case is the irq value updated or the device +switched into MSI mode. The device driver must decide what action to take if -pci_enable_msi_block() returns a value less than the number asked for. -Some devices can make use of fewer interrupts than the maximum they -request; in this case the driver should call pci_enable_msi_block() +pci_enable_msi_block() returns a value less than the number requested. +For instance, the driver could still make use of fewer interrupts; +in this case the driver should call pci_enable_msi_block() again. Note that it is not guaranteed to succeed, even when the 'count' has been reduced to the value returned from a previous call to pci_enable_msi_block(). 
This is because there are multiple constraints on the number of vectors that can be allocated; pci_enable_msi_block() -will return as soon as it finds any constraint that doesn't allow the +returns as soon as it finds any constraint that doesn't allow the call to succeed. 4.2.3 pci_disable_msi @@ -137,10 +137,10 @@ interrupt number and frees the previously allocated message signaled interrupt(s). The interrupt may subsequently be assigned to another device, so drivers should not cache the value of dev->irq. -A device driver must always call free_irq() on the interrupt(s) -for which it has called request_irq() before calling this function. -Failure to do so will result in a BUG_ON(), the device will be left with -MSI enabled and will leak its vector. +Before calling this function, a device driver must always call free_irq() +on any interrupt for which it previously called request_irq(). +Failure to do so results in a BUG_ON(), leaving the device with +MSI enabled and thus leaking its vector. 4.3 Using MSI-X @@ -155,10 +155,10 @@ struct msix_entry { }; This allows for the device to use these interrupts in a sparse fashion; -for example it could use interrupts 3 and 1027 and allocate only a +for example, it could use interrupts 3 and 1027 and yet allocate only a two-element array. The driver is expected to fill in the 'entry' value -in each element of the array to indicate which entries it wants the kernel -to assign interrupts for. It is invalid to fill in two entries with the +in each element of the array to indicate for which entries the kernel +should assign interrupts; it is invalid to fill in two entries with the same number. 4.3.1 pci_enable_msix @@ -168,10 +168,11 @@ int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec) Calling this function asks the PCI subsystem to allocate 'nvec' MSIs. The 'entries' argument is a pointer to an array of msix_entry structs which should be at least 'nvec' entries in size. On success, the -function will return 0 and the device will have been switched into -MSI-X interrupt mode. The 'vector' elements in each entry will have -been filled in with the interrupt number. The driver should then call -request_irq() for each 'vector' that it decides to use. +device is switched into MSI-X mode and the function returns 0. +The 'vector' member in each entry is populated with the interrupt number; +the driver should then call request_irq() for each 'vector' that it +decides to use. The device driver is responsible for keeping track of the +interrupts assigned to the MSI-X vectors so it can free them again later. If this function returns a negative number, it indicates an error and the driver should not attempt to allocate any more MSI-X interrupts for @@ -181,16 +182,14 @@ below. This function, in contrast with pci_enable_msi(), does not adjust dev->irq. The device will not generate interrupts for this interrupt -number once MSI-X is enabled. The device driver is responsible for -keeping track of the interrupts assigned to the MSI-X vectors so it can -free them again later. +number once MSI-X is enabled. Device drivers should normally call this function once per device during the initialization phase. -It is ideal if drivers can cope with a variable number of MSI-X interrupts, +It is ideal if drivers can cope with a variable number of MSI-X interrupts; there are many reasons why the platform may not be able to provide the -exact number a driver asks for. +exact number that a driver asks for. 
A request loop to achieve that might look like: @@ -212,15 +211,15 @@ static int foo_driver_enable_msix(struct foo_adapter *adapter, int nvec) void pci_disable_msix(struct pci_dev *dev) -This API should be used to undo the effect of pci_enable_msix(). It frees +This function should be used to undo the effect of pci_enable_msix(). It frees the previously allocated message signaled interrupts. The interrupts may subsequently be assigned to another device, so drivers should not cache the value of the 'vector' elements over a call to pci_disable_msix(). -A device driver must always call free_irq() on the interrupt(s) -for which it has called request_irq() before calling this function. -Failure to do so will result in a BUG_ON(), the device will be left with -MSI enabled and will leak its vector. +Before calling this function, a device driver must always call free_irq() +on any interrupt for which it previously called request_irq(). +Failure to do so results in a BUG_ON(), leaving the device with +MSI-X enabled and thus leaking its vector. 4.3.3 The MSI-X Table @@ -232,10 +231,10 @@ mask or unmask an interrupt, it should call disable_irq() / enable_irq(). 4.4 Handling devices implementing both MSI and MSI-X capabilities If a device implements both MSI and MSI-X capabilities, it can -run in either MSI mode or MSI-X mode but not both simultaneously. +run in either MSI mode or MSI-X mode, but not both simultaneously. This is a requirement of the PCI spec, and it is enforced by the PCI layer. Calling pci_enable_msi() when MSI-X is already enabled or -pci_enable_msix() when MSI is already enabled will result in an error. +pci_enable_msix() when MSI is already enabled results in an error. If a device driver wishes to switch between MSI and MSI-X at runtime, it must first quiesce the device, then switch it back to pin-interrupt mode, before calling pci_enable_msi() or pci_enable_msix() and resuming @@ -251,7 +250,7 @@ the MSI-X facilities in preference to the MSI facilities. As mentioned above, MSI-X supports any number of interrupts between 1 and 2048. In constrast, MSI is restricted to a maximum of 32 interrupts (and must be a power of two). In addition, the MSI interrupt vectors must -be allocated consecutively, so the system may not be able to allocate +be allocated consecutively, so the system might not be able to allocate as many vectors for MSI as it could for MSI-X. On some platforms, MSI interrupts must all be targeted at the same set of CPUs whereas MSI-X interrupts can all be targeted at different CPUs. @@ -281,7 +280,7 @@ disabled to enabled and back again. Using 'lspci -v' (as root) may show some devices with "MSI", "Message Signalled Interrupts" or "MSI-X" capabilities. Each of these capabilities -has an 'Enable' flag which will be followed with either "+" (enabled) +has an 'Enable' flag which is followed with either "+" (enabled) or "-" (disabled). @@ -298,7 +297,7 @@ The PCI stack provides three ways to disable MSIs: Some host chipsets simply don't support MSIs properly. If we're lucky, the manufacturer knows this and has indicated it in the ACPI -FADT table. In this case, Linux will automatically disable MSIs. +FADT table. In this case, Linux automatically disables MSIs. Some boards don't include this information in the table and so we have to detect them ourselves. The complete list of these is found near the quirk_disable_all_msi() function in drivers/pci/quirks.c. 
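The foo_driver_enable_msix() loop named in the hunk above appears only as
diff context; its body is elided. A minimal sketch of such a loop,
consistent with the pci_enable_msix() return convention described earlier
(0 on success, a negative errno on hard failure, a smaller usable count on
partial success), might look as follows; FOO_DRIVER_MINIMUM_NVEC and
struct foo_adapter are hypothetical names, not symbols from MSI-HOWTO.txt:

	static int foo_driver_enable_msix(struct foo_adapter *adapter, int nvec)
	{
		int rc;

		while (nvec >= FOO_DRIVER_MINIMUM_NVEC) {
			/* Ask for 'nvec' vectors backed by the entries array. */
			rc = pci_enable_msix(adapter->pdev,
					     adapter->msix_entries, nvec);
			if (rc == 0)
				return nvec;	/* got everything we asked for */
			if (rc < 0)
				return rc;	/* hard error, do not retry */
			nvec = rc;		/* retry with the reported count */
		}

		return -ENOSPC;
	}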
@@ -317,7 +316,7 @@ Some bridges allow you to enable MSIs by changing some bits in their PCI configuration space (especially the Hypertransport chipsets such as the nVidia nForce and Serverworks HT2000). As with host chipsets, Linux mostly knows about them and automatically enables MSIs if it can. -If you have a bridge which Linux doesn't yet know about, you can enable +If you have a bridge unknown to Linux, you can enable MSIs in configuration space using whatever method you know works, then enable MSIs on that bridge by doing: @@ -327,7 +326,7 @@ where $bridge is the PCI address of the bridge you've enabled (eg 0000:00:0e.0). To disable MSIs, echo 0 instead of 1. Changing this value should be -done with caution as it can break interrupt handling for all devices +done with caution as it could break interrupt handling for all devices below this bridge. Again, please notify linux-pci@vger.kernel.org of any bridges that need @@ -336,7 +335,7 @@ special handling. 5.3. Disabling MSIs on a single device Some devices are known to have faulty MSI implementations. Usually this -is handled in the individual device driver but occasionally it's necessary +is handled in the individual device driver, but occasionally it's necessary to handle this with a quirk. Some drivers have an option to disable use of MSI. While this is a convenient workaround for the driver author, it is not good practise, and should not be emulated. @@ -350,7 +349,7 @@ for your machine. You should also check your .config to be sure you have enabled CONFIG_PCI_MSI. Then, 'lspci -t' gives the list of bridges above a device. Reading -/sys/bus/pci/devices/*/msi_bus will tell you whether MSI are enabled (1) +/sys/bus/pci/devices/*/msi_bus will tell you whether MSIs are enabled (1) or disabled (0). If 0 is found in any of the msi_bus files belonging to bridges between the PCI root and the device, MSIs are disabled. diff --git a/Documentation/SubmittingDrivers b/Documentation/SubmittingDrivers index 319baa8b60dd..36d16bbf72c6 100644 --- a/Documentation/SubmittingDrivers +++ b/Documentation/SubmittingDrivers @@ -130,7 +130,7 @@ Linux kernel master tree: ftp.??.kernel.org:/pub/linux/kernel/... ?? == your country code, such as "us", "uk", "fr", etc. - http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git + http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git Linux kernel mailing list: linux-kernel@vger.kernel.org diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index 569f3532e138..4468ce24427c 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -303,7 +303,7 @@ patches that are being emailed around. The sign-off is a simple line at the end of the explanation for the patch, which certifies that you wrote it or otherwise have the right to -pass it on as a open-source patch. The rules are pretty simple: if you +pass it on as an open-source patch. The rules are pretty simple: if you can certify the below: Developer's Certificate of Origin 1.1 diff --git a/Documentation/block/cfq-iosched.txt b/Documentation/block/cfq-iosched.txt index e578feed6d81..6d670f570451 100644 --- a/Documentation/block/cfq-iosched.txt +++ b/Documentation/block/cfq-iosched.txt @@ -43,3 +43,74 @@ If one sets slice_idle=0 and if storage supports NCQ, CFQ internally switches to IOPS mode and starts providing fairness in terms of number of requests dispatched. Note that this mode switching takes effect only for group scheduling. For non-cgroup users nothing should change. 
+
+CFQ IO scheduler Idling Theory
+===============================
+Idling on a queue is primarily about waiting for the next request to arrive
+on the same queue after the completion of a request. While idling, CFQ will
+not dispatch requests from other cfq queues even if requests are pending
+there.
+
+The rationale behind idling is that it can cut down on the number of seeks
+on rotational media. For example, if a process is doing dependent
+sequential reads (the next read is issued only after completion of the
+previous one), then not dispatching requests from another queue should
+help, as we did not move the disk head and kept on dispatching sequential
+IO from one queue.
+
+CFQ has the following service trees, and the various queues are put on
+these trees:
+
+	sync-idle	sync-noidle	async
+
+All cfq queues doing synchronous sequential IO go on the sync-idle tree.
+On this tree we idle on each queue individually.
+
+All synchronous non-sequential queues go on the sync-noidle tree. Any
+requests which are marked with REQ_NOIDLE also go on this service tree.
+On this tree we do not idle on individual queues; instead we idle on the
+whole group of queues, i.e. the tree. So if there are 4 queues waiting
+for IO to dispatch, we will idle only once the last queue has dispatched
+its IO and there is no more IO on this service tree.
+
+All async writes go on the async service tree. There is no idling on
+async queues.
+
+CFQ has some optimizations for SSDs: if it detects non-rotational media
+which can support a higher queue depth (multiple requests in flight at a
+time), then it cuts down on idling of individual queues and all the
+queues move to the sync-noidle tree, so that only tree idling remains.
+This tree idling provides isolation from buffered write queues on the
+async tree.
+
+FAQ
+===
+Q1. Why idle at all on queues marked with REQ_NOIDLE?
+
+A1. We only do tree idling (all queues on the sync-noidle tree) for queues
+    marked with REQ_NOIDLE. This helps in providing isolation from all the
+    sync-idle queues. Otherwise, in the presence of many sequential readers,
+    other synchronous IO might not get its fair share of the disk.
+
+    For example, suppose there are 10 sequential readers doing IO and each
+    gets a 100ms time slice. If a REQ_NOIDLE request comes in, it will be
+    scheduled roughly after 1 second. If we do not idle after completion of
+    the REQ_NOIDLE request, and a couple of milliseconds later another
+    REQ_NOIDLE request comes in, it will again be scheduled after 1 second.
+    Repeat this and notice how a workload can lose its disk share and
+    suffer due to multiple sequential readers.
+
+    fsync can generate dependent IO, where a bunch of data is written in
+    the context of fsync and some journaling data is written later. The
+    journaling data comes in only after fsync has finished its IO (at
+    least for ext4 that seemed to be the case). Now if one decides not to
+    idle on the fsync thread due to REQ_NOIDLE, then the next journaling
+    write will not get scheduled for another second. A process doing small
+    fsyncs will suffer badly in the presence of multiple sequential
+    readers.
+
+    Hence, doing tree idling on threads using the REQ_NOIDLE flag on
+    requests provides isolation from multiple sequential readers while, at
+    the same time, not idling on individual threads.
+
+Q2. When to specify REQ_NOIDLE?
+A2. Whenever one is doing a synchronous write and does not expect more
+    writes to be dispatched from the same context soon, one should be able
+    to specify REQ_NOIDLE on the writes, and that should work well for
+    most cases.
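As a rough illustration of A2 (a sketch for this document, not an excerpt
from cfq-iosched.txt): a driver or filesystem submitting a one-off
synchronous write can tag it with REQ_NOIDLE so that CFQ only tree-idles
on it. In kernels of this vintage WRITE_SYNC is already defined as
(WRITE | REQ_SYNC | REQ_NOIDLE), so spelling the flags out below is purely
for clarity; foo_submit_lone_sync_write() is a hypothetical helper:

	#include <linux/bio.h>
	#include <linux/blk_types.h>
	#include <linux/fs.h>

	/* Submit a synchronous write that no related IO will follow soon. */
	static void foo_submit_lone_sync_write(struct bio *bio)
	{
		submit_bio(WRITE | REQ_SYNC | REQ_NOIDLE, bio);
	}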
diff --git a/Documentation/email-clients.txt b/Documentation/email-clients.txt index a0b58e29f911..860c29a472ad 100644 --- a/Documentation/email-clients.txt +++ b/Documentation/email-clients.txt @@ -199,18 +199,16 @@ to coerce it into behaving. To beat some sense out of the internal editor, do this: -- Under account settings, composition and addressing, uncheck "Compose - messages in HTML format". - - Edit your Thunderbird config settings so that it won't use format=flowed. Go to "edit->preferences->advanced->config editor" to bring up the thunderbird's registry editor, and set "mailnews.send_plaintext_flowed" to "false". -- Enable "preformat" mode: Shft-click on the Write icon to bring up the HTML - composer, select "Preformat" from the drop-down box just under the subject - line, then close the message without saving. (This setting also applies to - the text composer, but the only control for it is in the HTML composer.) +- Disable HTML Format: Set "mail.identity.id1.compose_html" to "false". + +- Enable "preformat" mode: Set "editor.quotesPreformatted" to "true". + +- Enable UTF8: Set "prefs.converted-to-utf8" to "true". - Install the "toggle wordwrap" extension. Download the file from: https://addons.mozilla.org/thunderbird/addon/2351/ diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index c4a6e148732a..4dc465477665 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -592,3 +592,11 @@ Why: In 3.0, we can now autodetect internal 3G device and already have interface that was used by acer-wmi driver. It will replaced by information log when acer-wmi initial. Who: Lee, Chun-Yi + +---------------------------- +What: The XFS nodelaylog mount option +When: 3.3 +Why: The delaylog mode that has been the default since 2.6.39 has proven + stable, and the old code is in the way of additional improvements in + the log code. +Who: Christoph Hellwig diff --git a/Documentation/filesystems/befs.txt b/Documentation/filesystems/befs.txt index 6e49c363938e..da45e6c842b8 100644 --- a/Documentation/filesystems/befs.txt +++ b/Documentation/filesystems/befs.txt @@ -27,7 +27,7 @@ His original code can still be found at: Does anyone know of a more current email address for Makoto? He doesn't respond to the address given above... -Current maintainer: Sergey S. Kostyliov +This filesystem doesn't have a maintainer. WHAT IS THIS DRIVER? ================== diff --git a/Documentation/hwmon/max16065 b/Documentation/hwmon/max16065 index 44b4f61e04f9..c11f64a1f2ad 100644 --- a/Documentation/hwmon/max16065 +++ b/Documentation/hwmon/max16065 @@ -62,6 +62,13 @@ can be safely used to identify the chip. You will have to instantiate the devices explicitly. Please see Documentation/i2c/instantiating-devices for details. +WARNING: Do not access chip registers using the i2cdump command, and do not use +any of the i2ctools commands on a command register (0xa5 to 0xac). The chips +supported by this driver interpret any access to a command register (including +read commands) as request to execute the command in question. This may result in +power loss, board resets, and/or Flash corruption. Worst case, your board may +turn into a brick. 
+ Sysfs entries ------------- diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt index 845a191004b1..54078ed96b37 100644 --- a/Documentation/ioctl/ioctl-number.txt +++ b/Documentation/ioctl/ioctl-number.txt @@ -319,4 +319,6 @@ Code Seq#(hex) Include File Comments 0xF4 00-1F video/mbxfb.h mbxfb +0xF6 all LTTng Linux Trace Toolkit Next Generation + 0xFD all linux/dm-ioctl.h diff --git a/Documentation/kernel-docs.txt b/Documentation/kernel-docs.txt index 9a8674629a07..0e0734b509d8 100644 --- a/Documentation/kernel-docs.txt +++ b/Documentation/kernel-docs.txt @@ -620,17 +620,6 @@ (including this document itself) have been moved there, and might be more up to date than the web version. - * Name: "Linux Source Driver" - URL: http://lsd.linux.cz - Keywords: Browsing source code. - Description: "Linux Source Driver (LSD) is an application, which - can make browsing source codes of Linux kernel easier than you can - imagine. You can select between multiple versions of kernel (e.g. - 0.01, 1.0.0, 2.0.33, 2.0.34pre13, 2.0.0, 2.1.101 etc.). With LSD - you can search Linux kernel (fulltext, macros, types, functions - and variables) and LSD can generate patches for you on the fly - (files, directories or kernel)". - * Name: "Linux Kernel Source Reference" Author: Thomas Graichen. URL: http://marc.info/?l=linux-kernel&m=96446640102205&w=4 diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 78926aa2531c..614d0382e2cb 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -40,6 +40,7 @@ parameter is applicable: ALSA ALSA sound support is enabled. APIC APIC support is enabled. APM Advanced Power Management support is enabled. + ARM ARM architecture is enabled. AVR32 AVR32 architecture is enabled. AX25 Appropriate AX.25 support is enabled. BLACKFIN Blackfin architecture is enabled. @@ -49,6 +50,7 @@ parameter is applicable: EFI EFI Partitioning (GPT) is enabled EIDE EIDE/ATAPI support is enabled. FB The frame buffer device is enabled. + FTRACE Function tracing enabled. GCOV GCOV profiling is enabled. HW Appropriate hardware is enabled. IA-64 IA-64 architecture is enabled. @@ -69,6 +71,7 @@ parameter is applicable: Documentation/m68k/kernel-options.txt. MCA MCA bus support is enabled. MDA MDA console support is enabled. + MIPS MIPS architecture is enabled. MOUSE Appropriate mouse support is enabled. MSI Message Signaled Interrupts (PCI). MTD MTD (Memory Technology Device) support is enabled. @@ -100,7 +103,6 @@ parameter is applicable: SPARC Sparc architecture is enabled. SWSUSP Software suspend (hibernation) is enabled. SUSPEND System suspend states are enabled. - FTRACE Function tracing enabled. TPM TPM drivers are enabled. TS Appropriate touchscreen support is enabled. UMS USB Mass Storage support is enabled. @@ -115,7 +117,7 @@ parameter is applicable: X86-64 X86-64 architecture is enabled. More X86-64 boot options can be found in Documentation/x86/x86_64/boot-options.txt . - X86 Either 32bit or 64bit x86 (same as X86-32+X86-64) + X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64) XEN Xen support is enabled In addition, the following text indicates that the option: @@ -376,7 +378,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. atkbd.softrepeat= [HW] Use software keyboard repeat - autotest [IA64] + autotest [IA-64] baycom_epp= [HW,AX25] Format: , @@ -681,8 +683,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted. 
uart[8250],mmio32,[,options] Start an early, polled-mode console on the 8250/16550 UART at the specified I/O port or MMIO address. - MMIO inter-register address stride is either 8bit (mmio) - or 32bit (mmio32). + MMIO inter-register address stride is either 8-bit + (mmio) or 32-bit (mmio32). The options are the same as for ttyS, above. earlyprintk= [X86,SH,BLACKFIN] @@ -725,7 +727,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. See Documentation/block/as-iosched.txt and Documentation/block/deadline-iosched.txt for details. - elfcorehdr= [IA64,PPC,SH,X86] + elfcorehdr= [IA-64,PPC,SH,X86] Specifies physical address of start of kernel core image elf header. Generally kexec loader will pass this option to capture kernel. @@ -791,7 +793,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. tracer at boot up. function-list is a comma separated list of functions. This list can be changed at run time by the set_ftrace_filter file in the debugfs - tracing directory. + tracing directory. ftrace_notrace=[function-list] [FTRACE] Do not trace the functions specified in @@ -829,7 +831,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. hashdist= [KNL,NUMA] Large hashes allocated during boot are distributed across NUMA nodes. Defaults on - for 64bit NUMA, off otherwise. + for 64-bit NUMA, off otherwise. Format: 0 | 1 (for off | on) hcl= [IA-64] SGI's Hardware Graph compatibility layer @@ -998,10 +1000,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. DMA. forcedac [x86_64] With this option iommu will not optimize to look - for io virtual address below 32 bit forcing dual + for io virtual address below 32-bit forcing dual address cycle on pci bus for cards supporting greater - than 32 bit addressing. The default is to look - for translation below 32 bit and if not available + than 32-bit addressing. The default is to look + for translation below 32-bit and if not available then look in the higher range. strict [Default Off] With this option on every unmap_single operation will @@ -1017,7 +1019,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. off disable Interrupt Remapping nosid disable Source ID checking - inttest= [IA64] + inttest= [IA-64] iomem= Disable strict checking of access to MMIO memory strict regions from userspace. @@ -1034,7 +1036,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. nomerge forcesac soft - pt [x86, IA64] + pt [x86, IA-64] io7= [HW] IO7 for Marvel based alpha systems See comment before marvel_specify_io7 in @@ -1165,7 +1167,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. kvm-amd.npt= [KVM,AMD] Disable nested paging (virtualized MMU) for all guests. - Default is 1 (enabled) if in 64bit or 32bit-PAE mode + Default is 1 (enabled) if in 64-bit or 32-bit PAE mode. kvm-intel.ept= [KVM,Intel] Disable extended page tables (virtualized MMU) support on capable Intel chips. @@ -1202,10 +1204,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. libata.dma=0 Disable all PATA and SATA DMA libata.dma=1 PATA and SATA Disk DMA only libata.dma=2 ATAPI (CDROM) DMA only - libata.dma=4 Compact Flash DMA only + libata.dma=4 Compact Flash DMA only Combinations also work, so libata.dma=3 enables DMA for disks and CDROMs, but not CFs. 
- + libata.ignore_hpa= [LIBATA] Ignore HPA limit libata.ignore_hpa=0 keep BIOS limits (default) libata.ignore_hpa=1 ignore limits, using full disk @@ -1331,7 +1333,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. ltpc= [NET] Format: ,, - machvec= [IA64] Force the use of a particular machine-vector + machvec= [IA-64] Force the use of a particular machine-vector (machvec) in a generic kernel. Example: machvec=hpzx1_swiotlb @@ -1348,9 +1350,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. it is equivalent to "nosmp", which also disables the IO APIC. - max_loop= [LOOP] Maximum number of loopback devices that can - be mounted - Format: <1-256> + max_loop= [LOOP] The number of loop block devices that get + (loop.max_loop) unconditionally pre-created at init time. The default + number is configured by BLK_DEV_LOOP_MIN_COUNT. Instead + of statically allocating a predefined number, loop + devices can be requested on-demand with the + /dev/loop-control interface. mcatest= [IA-64] @@ -1734,7 +1739,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. nointroute [IA-64] - nojitter [IA64] Disables jitter checking for ITC timers. + nojitter [IA-64] Disables jitter checking for ITC timers. no-kvmclock [X86,KVM] Disable paravirtualized KVM clock driver @@ -1800,7 +1805,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. nox2apic [X86-64,APIC] Do not enable x2APIC mode. - nptcg= [IA64] Override max number of concurrent global TLB + nptcg= [IA-64] Override max number of concurrent global TLB purges which is reported from either PAL_VM_SUMMARY or SAL PALO. @@ -2077,7 +2082,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted. Format: { parport | timid | 0 } See also Documentation/parport.txt. - pmtmr= [X86] Manual setup of pmtmr I/O Port. + pmtmr= [X86] Manual setup of pmtmr I/O Port. Override pmtimer IOPort with a hex value. e.g. pmtmr=0x508 @@ -2635,6 +2640,16 @@ bytes respectively. Such letter suffixes can also be entirely omitted. medium is write-protected). Example: quirks=0419:aaf5:rl,0421:0433:rc + user_debug= [KNL,ARM] + Format: + See arch/arm/Kconfig.debug help text. + 1 - undefined instruction events + 2 - system calls + 4 - invalid data aborts + 8 - SIGSEGV faults + 16 - SIGBUS faults + Example: user_debug=31 + userpte= [X86] Flags controlling user PTE allocations. diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX index 4edd78dfb362..bbce1215434a 100644 --- a/Documentation/networking/00-INDEX +++ b/Documentation/networking/00-INDEX @@ -1,13 +1,21 @@ 00-INDEX - this file +3c359.txt + - information on the 3Com TokenLink Velocity XL (3c5359) driver. 3c505.txt - information on the 3Com EtherLink Plus (3c505) driver. +3c509.txt + - information on the 3Com Etherlink III Series Ethernet cards. 6pack.txt - info on the 6pack protocol, an alternative to KISS for AX.25 DLINK.txt - info on the D-Link DE-600/DE-620 parallel port pocket adapters PLIP.txt - PLIP: The Parallel Line Internet Protocol device driver +README.ipw2100 + - README for the Intel PRO/Wireless 2100 driver. +README.ipw2200 + - README for the Intel PRO/Wireless 2915ABG and 2200BG driver. README.sb1000 - info on General Instrument/NextLevel SURFboard1000 cable modem. alias.txt @@ -20,8 +28,12 @@ atm.txt - info on where to get ATM programs and support for Linux. 
 ax25.txt
	- info on using AX.25 and NET/ROM code for Linux
+batman-adv.txt
+	- B.A.T.M.A.N routing protocol on top of layer 2 Ethernet Frames.
 baycom.txt
	- info on the driver for Baycom style amateur radio modems
+bonding.txt
+	- Linux Ethernet Bonding Driver HOWTO: link aggregation in Linux.
 bridge.txt
	- where to get user space programs for ethernet bridging with Linux.
 can.txt
@@ -34,32 +46,60 @@ cxacru.txt
	- Conexant AccessRunner USB ADSL Modem
 cxacru-cf.py
	- Conexant AccessRunner USB ADSL Modem configuration file parser
+cxgb.txt
+	- Release Notes for the Chelsio N210 Linux device driver.
+dccp.txt
+	- the Datagram Congestion Control Protocol (DCCP) (RFC 4340..42).
 de4x5.txt
	- the Digital EtherWORKS DE4?? and DE5?? PCI Ethernet driver
 decnet.txt
	- info on using the DECnet networking layer in Linux.
 depca.txt
	- the Digital DEPCA/EtherWORKS DE1?? and DE2?? LANCE Ethernet driver
+dl2k.txt
+	- README for D-Link DL2000-based Gigabit Ethernet Adapters (dl2k.ko).
+dm9000.txt
+	- README for the Simtec DM9000 Network driver.
 dmfe.txt
	- info on the Davicom DM9102(A)/DM9132/DM9801 fast ethernet driver.
+dns_resolver.txt
+	- The DNS resolver module allows kernel services to make DNS queries.
+driver.txt
+	- Softnet driver issues.
 e100.txt
	- info on Intel's EtherExpress PRO/100 line of 10/100 boards
 e1000.txt
	- info on Intel's E1000 line of gigabit ethernet boards
+e1000e.txt
+	- README for the Intel Gigabit Ethernet Driver (e1000e).
 eql.txt
	- serial IP load balancing
 ewrk3.txt
	- the Digital EtherWORKS 3 DE203/4/5 Ethernet driver
+fib_trie.txt
+	- Level Compressed Trie (LC-trie) notes: a structure for routing.
 filter.txt
	- Linux Socket Filtering
 fore200e.txt
	- FORE Systems PCA-200E/SBA-200E ATM NIC driver info.
 framerelay.txt
	- info on using Frame Relay/Data Link Connection Identifier (DLCI).
+gen_stats.txt
+	- Generic networking statistics for netlink users.
+generic_hdlc.txt
+	- The generic High Level Data Link Control (HDLC) layer.
 generic_netlink.txt
	- info on Generic Netlink
+gianfar.txt
+	- Gianfar Ethernet Driver.
 ieee802154.txt
	- Linux IEEE 802.15.4 implementation, API and drivers
+ifenslave.c
+	- Configure network interfaces for parallel routing (bonding).
+igb.txt
+	- README for the Intel Gigabit Ethernet Driver (igb).
+igbvf.txt
+	- README for the Intel Gigabit Ethernet Driver (igbvf).
 ip-sysctl.txt
	- /proc/sys/net/ipv4/* variables
 ip_dynaddr.txt
@@ -68,41 +108,117 @@ ipddp.txt
	- AppleTalk-IP Decapsulation and AppleTalk-IP Encapsulation
 iphase.txt
	- Interphase PCI ATM (i)Chip IA Linux driver info.
+ipv6.txt
+	- Options to the ipv6 kernel module.
+ipvs-sysctl.txt
+	- Per-inode explanation of the /proc/sys/net/ipv4/vs interface.
 irda.txt
	- where to get IrDA (infrared) utilities and info for Linux.
+ixgb.txt
+	- README for the Intel 10 Gigabit Ethernet Driver (ixgb).
+ixgbe.txt
+	- README for the Intel 10 Gigabit Ethernet Driver (ixgbe).
+ixgbevf.txt
+	- README for the Intel Virtual Function (VF) Driver (ixgbevf).
+l2tp.txt
+	- User guide to the L2TP tunnel protocol.
 lapb-module.txt
	- programming information of the LAPB module.
 ltpc.txt
	- the Apple or Farallon LocalTalk PC card driver
+mac80211-injection.txt
+	- HOWTO use packet injection with mac80211
 multicast.txt
	- Behaviour of cards under Multicast
+multiqueue.txt
+	- HOWTO for multiqueue network device support.
+netconsole.txt
+	- The network console module netconsole.ko: configuration and notes.
+netdev-features.txt
+	- Network interface features API description.
 netdevices.txt
	- info on network device driver functions exported to the kernel.
+netif-msg.txt
+	- Design of the network interface message level setting (NETIF_MSG_*).
+nfc.txt
+	- The Linux Near Field Communication (NFC) subsystem.
 olympic.txt
	- IBM PCI Pit/Pit-Phy/Olympic Token Ring driver info.
+operstates.txt
+	- Overview of network interface operational states.
+packet_mmap.txt
+	- User guide to memory mapped packet socket rings (PACKET_[RT]X_RING).
+phonet.txt
+	- The Phonet packet protocol used in Nokia cellular modems.
+phy.txt
+	- The PHY abstraction layer.
+pktgen.txt
+	- User guide to the kernel packet generator (pktgen.ko).
 policy-routing.txt
	- IP policy-based routing
+ppp_generic.txt
+	- Information about the generic PPP driver.
+proc_net_tcp.txt
+	- Per inode overview of the /proc/net/tcp and /proc/net/tcp6 interfaces.
+radiotap-headers.txt
+	- Background on radiotap headers.
 ray_cs.txt
	- Raylink Wireless LAN card driver info.
+rds.txt
+	- Background on the reliable, ordered datagram delivery method RDS.
+regulatory.txt
+	- Overview of the Linux wireless regulatory infrastructure.
+rxrpc.txt
+	- Guide to the RxRPC protocol.
+s2io.txt
+	- Release notes for Neterion Xframe I/II 10GbE driver.
+scaling.txt
+	- Explanation of network scaling techniques: RSS, RPS, RFS, aRFS, XPS.
+sctp.txt
+	- Notes on the Linux kernel implementation of the SCTP protocol.
+secid.txt
+	- Explanation of the secid member in flow structures.
 skfp.txt
	- SysKonnect FDDI (SK-5xxx, Compaq Netelligent) driver info.
 smc9.txt
	- the driver for SMC's 9000 series of Ethernet cards
 smctr.txt
	- SMC TokenCard TokenRing Linux driver info.
+spider-net.txt
+	- README for the Spidernet Driver (as found in PS3 / Cell BE).
+stmmac.txt
+	- README for the STMicro Synopsys Ethernet driver.
+tc-actions-env-rules.txt
+	- rules for traffic control (tc) actions.
+timestamping.txt
+	- overview of network packet timestamping variants.
 tcp.txt
	- short blurb on how TCP output takes place.
+tcp-thin.txt
+	- kernel tuning options for low rate 'thin' TCP streams.
 tlan.txt
	- ThunderLAN (Compaq Netelligent 10/100, Olicom OC-2xxx) driver info.
 tms380tr.txt
	- SysKonnect Token Ring ISA/PCI adapter driver info.
+tproxy.txt
+	- Transparent proxy support user guide.
 tuntap.txt
	- TUN/TAP device driver, allowing user space Rx/Tx of packets.
+udplite.txt
+	- UDP-Lite protocol (RFC 3828) introduction.
 vortex.txt
	- info on using 3Com Vortex (3c590, 3c592, 3c595, 3c597) Ethernet cards.
+vxge.txt
+	- README for the Neterion X3100 PCIe Server Adapter.
 x25.txt
	- general info on X.25 development.
 x25-iface.txt
	- description of the X.25 Packet Layer to LAPB device interface.
+xfrm_proc.txt
+	- description of the statistics package for XFRM.
+xfrm_sync.txt
+	- sync patches for XFRM enable migration of an SA between hosts.
+xfrm_sysctl.txt
+	- description of the XFRM configuration options.
z8530drv.txt - info about Linux driver for Z8530 based HDLC cards for AX.25 diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index db2a4067013c..81546990f41c 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -992,7 +992,7 @@ bindv6only - BOOLEAN TRUE: disable IPv4-mapped address feature FALSE: enable IPv4-mapped address feature - Default: FALSE (as specified in RFC2553bis) + Default: FALSE (as specified in RFC3493) IPv6 Fragmentation: diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt index 7254b4b5910e..58fd7414e6c0 100644 --- a/Documentation/networking/scaling.txt +++ b/Documentation/networking/scaling.txt @@ -52,7 +52,8 @@ module parameter for specifying the number of hardware queues to configure. In the bnx2x driver, for instance, this parameter is called num_queues. A typical RSS configuration would be to have one receive queue for each CPU if the device supports enough queues, or otherwise at least -one for each cache domain at a particular cache level (L1, L2, etc.). +one for each memory domain, where a memory domain is a set of CPUs that +share a particular memory level (L1, L2, NUMA node, etc.). The indirection table of an RSS device, which resolves a queue by masked hash, is usually programmed by the driver at initialization. The @@ -82,11 +83,17 @@ RSS should be enabled when latency is a concern or whenever receive interrupt processing forms a bottleneck. Spreading load between CPUs decreases queue length. For low latency networking, the optimal setting is to allocate as many queues as there are CPUs in the system (or the -NIC maximum, if lower). Because the aggregate number of interrupts grows -with each additional queue, the most efficient high-rate configuration +NIC maximum, if lower). The most efficient high-rate configuration is likely the one with the smallest number of receive queues where no -CPU that processes receive interrupts reaches 100% utilization. Per-cpu -load can be observed using the mpstat utility. +receive queue overflows due to a saturated CPU, because in default +mode with interrupt coalescing enabled, the aggregate number of +interrupts (and thus work) grows with each additional queue. + +Per-cpu load can be observed using the mpstat utility, but note that on +processors with hyperthreading (HT), each hyperthread is represented as +a separate CPU. For interrupt handling, HT has shown no benefit in +initial tests, so limit the number of queues to the number of CPU cores +in the system. RPS: Receive Packet Steering @@ -145,7 +152,7 @@ the bitmap. == Suggested Configuration For a single queue device, a typical RPS configuration would be to set -the rps_cpus to the CPUs in the same cache domain of the interrupting +the rps_cpus to the CPUs in the same memory domain of the interrupting CPU. If NUMA locality is not an issue, this could also be all CPUs in the system. At high interrupt rate, it might be wise to exclude the interrupting CPU from the map since that already performs much work. @@ -154,7 +161,7 @@ For a multi-queue system, if RSS is configured so that a hardware receive queue is mapped to each CPU, then RPS is probably redundant and unnecessary. If there are fewer hardware queues than CPUs, then RPS might be beneficial if the rps_cpus for each queue are the ones that -share the same cache domain as the interrupting CPU for that queue. +share the same memory domain as the interrupting CPU for that queue. 
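As a concrete instance of the single-queue suggestion above (eth0 and the
CPU numbering are assumptions for illustration, not part of scaling.txt):
on a four-CPU system where CPU 0 services the NIC interrupt, the other
three CPUs in the same memory domain can be placed in the map with

	echo e > /sys/class/net/eth0/queues/rx-0/rps_cpus

where the hex bitmap 0xe selects CPUs 1-3 and, per the advice above,
leaves the interrupting CPU 0 out of the map.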
RFS: Receive Flow Steering @@ -326,7 +333,7 @@ The queue chosen for transmitting a particular flow is saved in the corresponding socket structure for the flow (e.g. a TCP connection). This transmit queue is used for subsequent packets sent on the flow to prevent out of order (ooo) packets. The choice also amortizes the cost -of calling get_xps_queues() over all packets in the connection. To avoid +of calling get_xps_queues() over all packets in the flow. To avoid ooo packets, the queue for a flow can subsequently only be changed if skb->ooo_okay is set for a packet in the flow. This flag indicates that there are no outstanding packets in the flow, so the transmit queue can diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt index 4ce5450ab6e8..6066e3a6b9a9 100644 --- a/Documentation/power/runtime_pm.txt +++ b/Documentation/power/runtime_pm.txt @@ -431,8 +431,7 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h: void pm_runtime_irq_safe(struct device *dev); - set the power.irq_safe flag for the device, causing the runtime-PM - suspend and resume callbacks (but not the idle callback) to be invoked - with interrupts disabled + callbacks to be invoked with interrupts off void pm_runtime_mark_last_busy(struct device *dev); - set the power.last_busy field to the current time diff --git a/Documentation/ramoops.txt b/Documentation/ramoops.txt new file mode 100644 index 000000000000..8fb1ba7fe7bf --- /dev/null +++ b/Documentation/ramoops.txt @@ -0,0 +1,76 @@ +Ramoops oops/panic logger +========================= + +Sergiu Iordache + +Updated: 8 August 2011 + +0. Introduction + +Ramoops is an oops/panic logger that writes its logs to RAM before the system +crashes. It works by logging oopses and panics in a circular buffer. Ramoops +needs a system with persistent RAM so that the content of that area can +survive after a restart. + +1. Ramoops concepts + +Ramoops uses a predefined memory area to store the dump. The start and size of +the memory area are set using two variables: + * "mem_address" for the start + * "mem_size" for the size. The memory size will be rounded down to a + power of two. + +The memory area is divided into "record_size" chunks (also rounded down to +power of two) and each oops/panic writes a "record_size" chunk of +information. + +Dumping both oopses and panics can be done by setting 1 in the "dump_oops" +variable while setting 0 in that variable dumps only the panics. + +The module uses a counter to record multiple dumps but the counter gets reset +on restart (i.e. new dumps after the restart will overwrite old ones). + +2. Setting the parameters + +Setting the ramoops parameters can be done in 2 different manners: + 1. Use the module parameters (which have the names of the variables described + as before). + 2. Use a platform device and set the platform data. The parameters can then + be set through that platform data. An example of doing that is: + +#include +[...] + +static struct ramoops_platform_data ramoops_data = { + .mem_size = <...>, + .mem_address = <...>, + .record_size = <...>, + .dump_oops = <...>, +}; + +static struct platform_device ramoops_dev = { + .name = "ramoops", + .dev = { + .platform_data = &ramoops_data, + }, +}; + +[... inside a function ...] +int ret; + +ret = platform_device_register(&ramoops_dev); +if (ret) { + printk(KERN_ERR "unable to register platform device\n"); + return ret; +} + +3. 
Dump format
+
+The data dump begins with a header, currently defined as "====" followed by a
+timestamp and a new line. The dump then continues with the actual data.
+
+4. Reading the data
+
+The dump data can be read from memory (through /dev/mem or other means).
+Getting the module parameters, which are needed in order to parse the data,
+can be done through /sys/module/ramoops/parameters/* .
diff --git a/Documentation/virtual/00-INDEX b/Documentation/virtual/00-INDEX
index fe0251c4cfb7..8e601991d91c 100644
--- a/Documentation/virtual/00-INDEX
+++ b/Documentation/virtual/00-INDEX
@@ -8,3 +8,6 @@ lguest/
	- Extremely simple hypervisor for experimental/educational use.
 uml/
	- User Mode Linux, builds/runs Linux kernel as a userspace program.
+virtio.txt
+	- Text version of draft virtio spec.
+	  See http://ozlabs.org/~rusty/virtio-spec
diff --git a/Documentation/virtual/lguest/lguest.c b/Documentation/virtual/lguest/lguest.c
index 043bd7df3139..d928c134dee6 100644
--- a/Documentation/virtual/lguest/lguest.c
+++ b/Documentation/virtual/lguest/lguest.c
@@ -1996,6 +1996,9 @@ int main(int argc, char *argv[])
	/* We use a simple helper to copy the arguments separated by spaces. */
	concat((char *)(boot + 1), argv+optind+2);

+	/* Set kernel alignment to 16M (CONFIG_PHYSICAL_ALIGN) */
+	boot->hdr.kernel_alignment = 0x1000000;
+
	/* Boot protocol version: 2.07 supports the fields for lguest. */
	boot->hdr.version = 0x207;

diff --git a/Documentation/virtual/virtio-spec.txt b/Documentation/virtual/virtio-spec.txt
new file mode 100644
index 000000000000..a350ae135b8c
--- /dev/null
+++ b/Documentation/virtual/virtio-spec.txt
@@ -0,0 +1,2200 @@
+[Generated file: see http://ozlabs.org/~rusty/virtio-spec/]
+Virtio PCI Card Specification
+v0.9.1 DRAFT
+-
+
+Rusty Russell IBM Corporation (Editor)
+
+2011 August 1.
+
+Purpose and Description
+
+This document describes the specifications of the “virtio” family
+of PCI devices. These devices are found in virtual environments,
+yet by design they are not all that different from physical PCI
+devices, and this document treats them as such. This allows the
+guest to use standard PCI drivers and discovery mechanisms.
+
+The purpose of virtio and this specification is that virtual
+environments and guests should have a straightforward, efficient,
+standard and extensible mechanism for virtual devices, rather
+than boutique per-environment or per-OS mechanisms.
+
+  Straightforward: Virtio PCI devices use normal PCI mechanisms
+  of interrupts and DMA which should be familiar to any device
+  driver author. There is no exotic page-flipping or COW
+  mechanism: it's just a PCI device.[footnote:
+This lack of page-sharing implies that the implementation of the
+device (e.g. the hypervisor or host) needs full access to the
+guest memory. Communication with untrusted parties (i.e.
+inter-guest communication) requires copying.
+]
+
+  Efficient: Virtio PCI devices consist of rings of descriptors
+  for input and output, which are neatly separated to avoid cache
+  effects from both guest and device writing to the same cache
+  lines.
+
+  Standard: Virtio PCI makes no assumptions about the environment
+  in which it operates, beyond supporting PCI.
In fact the virtio + devices specified in the appendices do not require PCI at all: + they have been implemented on non-PCI buses.[footnote: +The Linux implementation further separates the PCI virtio code +from the specific virtio drivers: these drivers are shared with +the non-PCI implementations (currently lguest and S/390). +] + + Extensible: Virtio PCI devices contain feature bits which are + acknowledged by the guest operating system during device setup. + This allows forwards and backwards compatibility: the device + offers all the features it knows about, and the driver + acknowledges those it understands and wishes to use. + + Virtqueues + +The mechanism for bulk data transport on virtio PCI devices is +pretentiously called a virtqueue. Each device can have zero or +more virtqueues: for example, the network device has one for +transmit and one for receive. + +Each virtqueue occupies two or more physically-contiguous pages +(defined, for the purposes of this specification, as 4096 bytes), +and consists of three parts: + + ++-------------------+-----------------------------------+-----------+ +| Descriptor Table | Available Ring (padding) | Used Ring | ++-------------------+-----------------------------------+-----------+ + + +When the driver wants to send buffers to the device, it puts them +in one or more slots in the descriptor table, and writes the +descriptor indices into the available ring. It then notifies the +device. When the device has finished with the buffers, it writes +the descriptors into the used ring, and sends an interrupt. + +Specification + + PCI Discovery + +Any PCI device with Vendor ID 0x1AF4, and Device ID 0x1000 +through 0x103F inclusive is a virtio device[footnote: +The actual value within this range is ignored +]. The device must also have a Revision ID of 0 to match this +specification. + +The Subsystem Device ID indicates which virtio device is +supported by the device. The Subsystem Vendor ID should reflect +the PCI Vendor ID of the environment (it's currently only used +for informational purposes by the guest). + + ++----------------------+--------------------+---------------+ +| Subsystem Device ID | Virtio Device | Specification | ++----------------------+--------------------+---------------+ ++----------------------+--------------------+---------------+ +| 1 | network card | Appendix C | ++----------------------+--------------------+---------------+ +| 2 | block device | Appendix D | ++----------------------+--------------------+---------------+ +| 3 | console | Appendix E | ++----------------------+--------------------+---------------+ +| 4 | entropy source | Appendix F | ++----------------------+--------------------+---------------+ +| 5 | memory ballooning | Appendix G | ++----------------------+--------------------+---------------+ +| 6 | ioMemory | - | ++----------------------+--------------------+---------------+ +| 9 | 9P transport | - | ++----------------------+--------------------+---------------+ + + + Device Configuration + +To configure the device, we use the first I/O region of the PCI +device. This contains a virtio header followed by a +device-specific region. + +There may be different widths of accesses to the I/O region; the “ +natural” access method for each field in the virtio header must +be used (i.e. 32-bit accesses for 32-bit fields, etc), but the +device-specific region can be accessed using any width accesses, +and should obtain the same results. + +Note that this is possible because while the virtio header is PCI +(i.e. 
little) endian, the device-specific region is encoded in +the native endian of the guest (where such distinction is +applicable). + + Device Initialization Sequence + +We start with an overview of device initialization, then expand +on the details of the device and how each step is preformed. + + Reset the device. This is not required on initial start up. + + The ACKNOWLEDGE status bit is set: we have noticed the device. + + The DRIVER status bit is set: we know how to drive the device. + + Device-specific setup, including reading the Device Feature + Bits, discovery of virtqueues for the device, optional MSI-X + setup, and reading and possibly writing the virtio + configuration space. + + The subset of Device Feature Bits understood by the driver is + written to the device. + + The DRIVER_OK status bit is set. + + The device can now be used (ie. buffers added to the + virtqueues)[footnote: +Historically, drivers have used the device before steps 5 and 6. +This is only allowed if the driver does not use any features +which would alter this early use of the device. +] + +If any of these steps go irrecoverably wrong, the guest should +set the FAILED status bit to indicate that it has given up on the +device (it can reset the device later to restart if desired). + +We now cover the fields required for general setup in detail. + + Virtio Header + +The virtio header looks as follows: + + ++------------++---------------------+---------------------+----------+--------+---------+---------+---------+--------+ +| Bits || 32 | 32 | 32 | 16 | 16 | 16 | 8 | 8 | ++------------++---------------------+---------------------+----------+--------+---------+---------+---------+--------+ +| Read/Write || R | R+W | R+W | R | R+W | R+W | R+W | R | ++------------++---------------------+---------------------+----------+--------+---------+---------+---------+--------+ +| Purpose || Device | Guest | Queue | Queue | Queue | Queue | Device | ISR | +| || Features bits 0:31 | Features bits 0:31 | Address | Size | Select | Notify | Status | Status | ++------------++---------------------+---------------------+----------+--------+---------+---------+---------+--------+ + + +If MSI-X is enabled for the device, two additional fields +immediately follow this header: + + ++------------++----------------+--------+ +| Bits || 16 | 16 | + +----------------+--------+ ++------------++----------------+--------+ +| Read/Write || R+W | R+W | ++------------++----------------+--------+ +| Purpose || Configuration | Queue | +| (MSI-X) || Vector | Vector | ++------------++----------------+--------+ + + +Finally, if feature bits (VIRTIO_F_FEATURES_HI) this is +immediately followed by two additional fields: + + ++------------++----------------------+---------------------- +| Bits || 32 | 32 ++------------++----------------------+---------------------- +| Read/Write || R | R+W ++------------++----------------------+---------------------- +| Purpose || Device | Guest +| || Features bits 32:63 | Features bits 32:63 ++------------++----------------------+---------------------- + + +Immediately following these general headers, there may be +device-specific headers: + + ++------------++--------------------+ +| Bits || Device Specific | + +--------------------+ ++------------++--------------------+ +| Read/Write || Device Specific | ++------------++--------------------+ +| Purpose || Device Specific... | +| || | ++------------++--------------------+ + + + Device Status + +The Device Status field is updated by the guest to indicate its +progress. 
+ Device Status
+
+The Device Status field is updated by the guest to indicate its
+progress. This provides a simple low-level diagnostic: it's most
+useful to imagine the status bits hooked up to traffic lights on
+the console indicating the status of each device.
+
+The device can be reset by writing a 0 to this field; otherwise
+at least one bit should be set:
+
+ ACKNOWLEDGE (1) Indicates that the guest OS has found the
+ device and recognized it as a valid virtio device.
+
+ DRIVER (2) Indicates that the guest OS knows how to drive the
+ device. Under Linux, drivers can be loadable modules so there
+ may be a significant (or infinite) delay before setting this
+ bit.
+
+ DRIVER_OK (4) Indicates that the driver is set up and ready to
+ drive the device.
+
+ FAILED (128) Indicates that something went wrong in the guest,
+ and it has given up on the device. This could be an internal
+ error, or the driver didn't like the device for some reason, or
+ even a fatal error during device operation. The device must be
+ reset before attempting to re-initialize.
+
+ Feature Bits
+
+The least significant 31 bits of the first configuration field
+indicate the features that the device supports (the high bit is
+reserved, and will be used to indicate the presence of future
+feature bits elsewhere). If more than 31 feature bits are
+supported, the device indicates so by setting feature bit 31 (see
+[cha:Reserved-Feature-Bits]). The bits are allocated as follows:
+
+ 0 to 23 Feature bits for the specific device type
+
+ 24 to 40 Feature bits reserved for extensions to the queue and
+ feature negotiation mechanisms
+
+ 41 to 63 Feature bits reserved for future extensions
+
+For example, feature bit 0 for a network device (i.e. Subsystem
+Device ID 1) indicates that the device supports checksumming of
+packets.
+
+The feature bits are negotiated: the device lists all the
+features it understands in the Device Features field, and the
+guest writes the subset that it understands into the Guest
+Features field. The only way to renegotiate is to reset the
+device.
+
+In particular, new fields in the device configuration header are
+indicated by offering a feature bit, so the guest can check
+before accessing that part of the configuration space.
+
+This allows for forwards and backwards compatibility: if the
+device is enhanced with a new feature bit, older guests will not
+write that feature bit back to the Guest Features field and it
+can go into backwards compatibility mode. Similarly, if a guest
+is enhanced with a feature that the device doesn't support, it
+will not see that feature bit in the Device Features field and
+can go into backwards compatibility mode (or, for poor
+implementations, set the FAILED Device Status bit).
+
+Access to feature bits 32 to 63 is enabled by the Guest by
+setting feature bit 31. If this bit is unset, the Device must
+assume that all feature bits > 31 are unset.
+
+ Configuration/Queue Vectors
+
+When the MSI-X capability is present and enabled in the device
+(through standard PCI configuration space) 4 bytes at byte offset
+20 are used to map configuration change and queue interrupts to
+MSI-X vectors. In this case, the ISR Status field is unused, and
+device-specific configuration starts at byte offset 24 in the
+virtio header structure. When the MSI-X capability is not
+enabled, device-specific configuration starts at byte offset 20
+in the virtio header.
+
+Writing a valid MSI-X Table entry number, 0 to 0x7FF, to one of
+the Configuration/Queue Vector registers, maps interrupts
+triggered by the configuration change/selected queue events
+respectively to the corresponding MSI-X vector. To disable
+interrupts for a specific event type, unmap it by writing a
+special NO_VECTOR value:
+
+/* Vector value used to disable MSI for queue */
+#define VIRTIO_MSI_NO_VECTOR 0xffff
+
+Reading these registers returns the vector mapped to a given
+event, or NO_VECTOR if unmapped. All queue and configuration
+change events are unmapped by default.
+
+Note that mapping an event to a vector might require allocating
+internal device resources, and might fail. Devices report such
+failures by returning the NO_VECTOR value when the relevant
+Vector field is read. After mapping an event to a vector, the
+driver must verify success by reading the Vector field value: on
+success, the previously written value is returned, and on
+failure, NO_VECTOR is returned. If a mapping failure is detected,
+the driver can retry mapping with fewer vectors, or disable
+MSI-X.
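+For illustration, a driver might map a queue event and verify the
+mapping like this (a sketch only: vp_iowrite16()/vp_ioread16()
+stand in for whatever 16-bit I/O accessors the platform provides,
+and QUEUE_VECTOR_OFF is the Queue Vector register, the second
+16-bit word of the 4 bytes at offset 20 described above):
+
+#include <stdint.h>
+
+extern void     vp_iowrite16(uint16_t val, unsigned off);
+extern uint16_t vp_ioread16(unsigned off);
+
+#define QUEUE_VECTOR_OFF 22
+
+static int map_queue_vector(uint16_t vector)
+{
+        vp_iowrite16(vector, QUEUE_VECTOR_OFF);
+        /* re-read to detect an allocation failure in the device */
+        if (vp_ioread16(QUEUE_VECTOR_OFF) == VIRTIO_MSI_NO_VECTOR)
+                return -1;
+        return 0;
+}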
+ Virtqueue Configuration
+
+As a device can have zero or more virtqueues for bulk data
+transport (for example, the network driver has two), the driver
+needs to configure them as part of the device-specific
+configuration.
+
+This is done as follows, for each virtqueue a device has:
+
+ Write the virtqueue index (first queue is 0) to the Queue
+ Select field.
+
+ Read the virtqueue size from the Queue Size field, which is
+ always a power of 2. This controls how big the virtqueue is
+ (see below). If this field is 0, the virtqueue does not exist.
+
+ Allocate and zero the virtqueue in contiguous physical memory,
+ on a 4096 byte alignment. Write the physical address, divided
+ by 4096, to the Queue Address field.[footnote:
+The 4096 is based on the x86 page size, but it's also large
+enough to ensure that the separate parts of the virtqueue are on
+separate cache lines.
+]
+
+ Optionally, if the MSI-X capability is present and enabled on
+ the device, select a vector to use to request interrupts
+ triggered by virtqueue events. Write the MSI-X Table entry
+ number corresponding to this vector into the Queue Vector
+ field. Read the Queue Vector field: on success, the previously
+ written value is returned; on failure, the NO_VECTOR value is
+ returned.
+
+The Queue Size field controls the total number of bytes required
+for the virtqueue according to the following formula:
+
+#define ALIGN(x) (((x) + 4095) & ~4095)
+
+static inline unsigned vring_size(unsigned int qsz)
+{
+        return ALIGN(sizeof(struct vring_desc)*qsz
+                     + sizeof(u16)*(2 + qsz))
+               + ALIGN(sizeof(struct vring_used_elem)*qsz);
+}
+
+This currently wastes some space with padding, but also allows
+future extensions. The virtqueue layout structure looks like this
+(qsz is the Queue Size field, which is a variable, so this code
+won't compile):
+
+struct vring {
+        /* The actual descriptors (16 bytes each) */
+        struct vring_desc desc[qsz];
+
+        /* A ring of available descriptor heads with free-running
+           index. */
+        struct vring_avail avail;
+
+        // Padding to the next 4096 boundary.
+        char pad[];
+
+        // A ring of used descriptor heads with free-running index.
+        struct vring_used used;
+};
+
+ A Note on Virtqueue Endianness
+
+Note that the endian of these fields and everything else in the
+virtqueue is the native endian of the guest, not little-endian as
+PCI normally is. This makes for simpler guest code, and it is
+assumed that the host already has to be deeply aware of the guest
+endian so such an “endian-aware” device is not a significant
+issue.
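+Putting the configuration steps together, per-virtqueue setup
+might look roughly like this (a sketch under assumptions:
+vp_ioread16()/vp_iowrite16()/vp_iowrite32() are platform I/O
+accessors, alloc_pages_zeroed() returns zeroed, 4096-byte-aligned
+and identity-mapped memory, and the register offsets follow the
+virtio header table):
+
+#include <stdint.h>
+#include <stddef.h>
+
+#define QUEUE_ADDRESS 8
+#define QUEUE_SIZE    12
+#define QUEUE_SELECT  14
+
+extern uint16_t vp_ioread16(unsigned off);
+extern void     vp_iowrite16(uint16_t val, unsigned off);
+extern void     vp_iowrite32(uint32_t val, unsigned off);
+extern void    *alloc_pages_zeroed(size_t size, size_t align);
+extern unsigned vring_size(unsigned qsz);   /* formula above */
+
+static int setup_virtqueue(uint16_t index, void **ring)
+{
+        uint16_t qsz;
+
+        vp_iowrite16(index, QUEUE_SELECT);
+        qsz = vp_ioread16(QUEUE_SIZE);
+        if (qsz == 0)
+                return -1;              /* virtqueue does not exist */
+
+        *ring = alloc_pages_zeroed(vring_size(qsz), 4096);
+        /* guest-physical address, divided by 4096 */
+        vp_iowrite32((uint32_t)((uintptr_t)*ring / 4096),
+                     QUEUE_ADDRESS);
+        return 0;
+}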
+ Descriptor Table
+
+The descriptor table refers to the buffers the guest is using for
+the device. The addresses are physical addresses, and the buffers
+can be chained via the next field. Each descriptor describes a
+buffer which is read-only or write-only, but a chain of
+descriptors can contain both read-only and write-only buffers.
+
+No descriptor chain may be more than 2^32 bytes long in total.
+
+struct vring_desc {
+        /* Address (guest-physical). */
+        u64 addr;
+        /* Length. */
+        u32 len;
+
+/* This marks a buffer as continuing via the next field. */
+#define VRING_DESC_F_NEXT 1
+/* This marks a buffer as write-only (otherwise read-only). */
+#define VRING_DESC_F_WRITE 2
+/* This means the buffer contains a list of buffer descriptors. */
+#define VRING_DESC_F_INDIRECT 4
+
+        /* The flags as indicated above. */
+        u16 flags;
+        /* Next field if flags & NEXT */
+        u16 next;
+};
+
+The number of descriptors in the table is specified by the Queue
+Size field for this virtqueue.
+
+ Indirect Descriptors
+
+Some devices benefit from concurrently dispatching a large number
+of large requests. The VIRTIO_RING_F_INDIRECT_DESC feature can be
+used to allow this (see [cha:Reserved-Feature-Bits]). To increase
+ring capacity it is possible to store a table of indirect
+descriptors anywhere in memory, and insert a descriptor in the
+main virtqueue (with flags&INDIRECT on) that refers to a memory
+buffer containing this indirect descriptor table; the fields addr
+and len refer to the indirect table address and length in bytes,
+respectively. The indirect table layout structure looks like this
+(len is the length of the descriptor that refers to this table,
+which is a variable, so this code won't compile):
+
+struct indirect_descriptor_table {
+        /* The actual descriptors (16 bytes each) */
+        struct vring_desc desc[len / 16];
+};
+
+The first indirect descriptor is located at the start of the
+indirect descriptor table (index 0); additional indirect
+descriptors are chained by the next field. An indirect descriptor
+without a next field (with flags&NEXT off) signals the end of the
+indirect descriptor table, and transfers control back to the main
+virtqueue. An indirect descriptor cannot refer to another
+indirect descriptor table (flags&INDIRECT must be off). A single
+indirect descriptor table can include both read-only and
+write-only descriptors; the write-only flag (flags&WRITE) in the
+descriptor that refers to it is ignored.
+
+ Available Ring
+
+The available ring refers to what descriptors we are offering the
+device: it refers to the head of a descriptor chain. The “flags”
+field is currently 0 or 1: 1 indicating that we do not need an
+interrupt when the device consumes a descriptor from the
+available ring. Alternatively, the guest can ask the device to
+delay interrupts until an entry with an index specified by the
+“used_event” field is written in the used ring (equivalently,
+until the idx field in the used ring reaches the value
+used_event + 1). The method employed by the device is controlled
+by the VIRTIO_RING_F_EVENT_IDX feature bit (see
+[cha:Reserved-Feature-Bits]). This interrupt suppression is
+merely an optimization; it may not suppress interrupts entirely.
+
+The “idx” field indicates where we would put the next descriptor
+entry (modulo the ring size). This starts at 0, and increases.
+
+struct vring_avail {
+#define VRING_AVAIL_F_NO_INTERRUPT 1
+        u16 flags;
+        u16 idx;
+        u16 ring[qsz]; /* qsz is the Queue Size field read from device */
+        u16 used_event;
+};
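+As an aside, the used_event mechanism lets the guest batch
+interrupts explicitly. A minimal sketch (assuming
+VIRTIO_RING_F_EVENT_IDX was negotiated; the flexible-array layout
+from Appendix A is used, so used_event sits at ring[qsz], and
+wmb() is an assumed platform write barrier):
+
+#include <stdint.h>
+
+struct vring_avail {
+        uint16_t flags;
+        uint16_t idx;
+        uint16_t ring[];        /* qsz entries, then used_event */
+};
+
+extern void wmb(void);
+
+/* Ask the device to interrupt only once 'batch' more buffers
+ * have been marked used (the index wraps naturally at 65536). */
+static void delay_used_interrupts(struct vring_avail *avail,
+                                  unsigned qsz,
+                                  uint16_t last_used_idx,
+                                  uint16_t batch)
+{
+        avail->ring[qsz] = (uint16_t)(last_used_idx + batch - 1);
+        wmb();          /* expose before re-checking the used idx */
+}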
+ Used Ring
+
+The used ring is where the device returns buffers once it is done
+with them. The flags field can be used by the device to hint that
+no notification is necessary when the guest adds to the available
+ring. Alternatively, the “avail_event” field can be used by the
+device to hint that no notification is necessary until an entry
+with an index specified by the “avail_event” is written in the
+available ring (equivalently, until the idx field in the
+available ring reaches the value avail_event + 1). The method
+employed by the device is controlled by the guest through the
+VIRTIO_RING_F_EVENT_IDX feature bit (see
+[cha:Reserved-Feature-Bits]).[footnote:
+These fields are kept here because this is the only part of the
+virtqueue written by the device.
+]
+
+Each entry in the ring is a pair: the head entry of the
+descriptor chain describing the buffer (this matches an entry
+placed in the available ring by the guest earlier), and the total
+number of bytes written into the buffer. The latter is extremely
+useful for guests using untrusted buffers: if you do not know
+exactly how much has been written by the device, you usually have
+to zero the buffer to ensure no data leakage occurs.
+
+/* u32 is used here for ids for padding reasons. */
+struct vring_used_elem {
+        /* Index of start of used descriptor chain. */
+        u32 id;
+        /* Total length of the descriptor chain which was used
+           (written to) */
+        u32 len;
+};
+
+struct vring_used {
+#define VRING_USED_F_NO_NOTIFY 1
+        u16 flags;
+        u16 idx;
+        struct vring_used_elem ring[qsz];
+        u16 avail_event;
+};
+
+ Helpers for Managing Virtqueues
+
+The Linux kernel source code contains the definitions above and
+helper routines in a more usable form, in
+include/linux/virtio_ring.h. This was explicitly licensed by IBM
+and Red Hat under the (3-clause) BSD license so that it can be
+freely used by all other projects, and is reproduced (with slight
+variation to remove Linux assumptions) in Appendix A.
+
+ Device Operation
+
+There are two parts to device operation: supplying new buffers to
+the device, and processing used buffers from the device. As an
+example, the virtio network device has two virtqueues: the
+transmit virtqueue and the receive virtqueue. The driver adds
+outgoing (read-only) packets to the transmit virtqueue, and then
+frees them after they are used. Similarly, incoming (write-only)
+buffers are added to the receive virtqueue, and processed after
+they are used.
+
+ Supplying Buffers to The Device
+
+Actual transfer of buffers from the guest OS to the device
+operates as follows:
+
+ 1. Place the buffer(s) into free descriptor(s).
+
+    If there are no free descriptors, the guest may choose to
+    notify the device even if notifications are suppressed (to
+    reduce latency).[footnote:
+The Linux drivers do this only for read-only buffers: for
+write-only buffers, it is assumed that the driver is merely
+trying to keep the receive buffer ring full, and no notification
+of this expected condition is necessary.
+]
+
+ 2. Place the id of the buffer in the next ring entry of the
+    available ring.
+
+ 3. The steps (1) and (2) may be performed repeatedly if batching
+    is possible.
+
+ 4. A memory barrier should be executed to ensure the device sees
+    the updated descriptor table and available ring before the
+    next step.
+
+ 5. The available “idx” field should be increased by the number
+    of entries added to the available ring.
+
+ 6. A memory barrier should be executed to ensure that we update
+    the idx field before checking for notification suppression.
+
+ 7. If notifications are not suppressed, the device should be
+    notified of the new buffers.
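+A compact sketch of these steps for a single, one-element,
+read-only buffer (illustrative only: the Appendix A structures
+are abbreviated, free-descriptor bookkeeping is elided, and
+wmb()/notify_device()/no_notify_set() stand in for the platform's
+write barrier, the Queue Notify write and the used->flags check):
+
+#include <stdint.h>
+
+/* structures as in Appendix A (abbreviated) */
+struct vring_desc  { uint64_t addr; uint32_t len;
+                     uint16_t flags, next; };
+struct vring_avail { uint16_t flags, idx, ring[]; };
+
+extern void wmb(void);
+extern void notify_device(void);
+extern int  no_notify_set(void);
+
+static void add_buffer(struct vring_desc *desc,
+                       struct vring_avail *avail,
+                       unsigned num, uint16_t d,
+                       uint64_t addr, uint32_t len)
+{
+        desc[d].addr  = addr;                /* step 1 */
+        desc[d].len   = len;
+        desc[d].flags = 0;                   /* read-only, no chain */
+
+        avail->ring[avail->idx % num] = d;   /* step 2 */
+        wmb();                               /* step 4 */
+        avail->idx++;                        /* step 5 */
+        wmb();                               /* step 6 */
+        if (!no_notify_set())
+                notify_device();             /* step 7 */
+}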
+Note that the above steps do not take precautions against the
+available ring buffer wrapping around: this is not possible since
+the ring buffer is the same size as the descriptor table, so step
+(1) will prevent such a condition.
+
+In addition, the maximum queue size is 32768 (it must be a power
+of 2 which fits in 16 bits), so the 16-bit “idx” value can always
+distinguish between a full and empty buffer.
+
+Here is a description of each stage in more detail.
+
+ Placing Buffers Into The Descriptor Table
+
+A buffer consists of zero or more read-only physically-contiguous
+elements followed by zero or more physically-contiguous
+write-only elements (it must have at least one element). This
+algorithm maps it into the descriptor table:
+
+ for each buffer element, b:
+
+ Get the next free descriptor table entry, d
+
+ Set d.addr to the physical address of the start of b
+
+ Set d.len to the length of b.
+
+ If b is write-only, set d.flags to VRING_DESC_F_WRITE,
+ otherwise 0.
+
+ If there is a buffer element after this:
+
+ Set d.next to the index of the next free descriptor element.
+
+ Set the VRING_DESC_F_NEXT bit in d.flags.
+
+In practice, the d.next fields are usually used to chain free
+descriptors, and a separate count kept to check there are enough
+free descriptors before beginning the mappings.
+
+ Updating The Available Ring
+
+The head of the buffer we mapped is the first d in the algorithm
+above. A naive implementation would do the following:
+
+avail->ring[avail->idx % qsz] = head;
+
+However, in general we can add many descriptors before we update
+the “idx” field (at which point they become visible to the
+device), so we keep a counter of how many we've added:
+
+avail->ring[(avail->idx + added++) % qsz] = head;
+
+ Updating The Index Field
+
+Once the idx field of the virtqueue is updated, the device will
+be able to access the descriptor entries we've created and the
+memory they refer to. This is why a memory barrier is generally
+used before the idx update, to ensure the device sees the most
+up-to-date copy.
+
+The idx field always increments, and we let it wrap naturally at
+65536:
+
+avail->idx += added;
+
+ Notifying The Device
+
+Device notification occurs by writing the 16-bit virtqueue index
+of this virtqueue to the Queue Notify field of the virtio header
+in the first I/O region of the PCI device. This can be expensive,
+however, so the device can suppress such notifications if it
+doesn't need them. We have to be careful to expose the new idx
+value before checking the suppression flag: it's OK to notify
+gratuitously, but not to omit a required notification. So again,
+we use a memory barrier here before reading the flags or the
+avail_event field.
+
+If the VIRTIO_F_RING_EVENT_IDX feature is not negotiated, and if
+the VRING_USED_F_NO_NOTIFY flag is not set, we go ahead and write
+to the Queue Notify field.
+
+If the VIRTIO_F_RING_EVENT_IDX feature is negotiated, we read the
+avail_event field in the used ring structure. If the available
+index crossed the avail_event field value since the last
+notification, we go ahead and write to the Queue Notify field.
+The avail_event field wraps naturally at 65536 as well:
+
+(u16)(new_idx - avail_event - 1) < (u16)(new_idx - old_idx)
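+This decision is exactly what the vring_need_event() helper in
+Appendix A computes. A sketch of the combined check (assumptions:
+used_flags() and used_avail_event() are hypothetical accessors
+for the device-written fields, and old_idx/new_idx are avail->idx
+before and after the batch was added):
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#define VRING_USED_F_NO_NOTIFY 1
+
+extern uint16_t used_flags(void);
+extern uint16_t used_avail_event(void);
+
+static inline int vring_need_event(uint16_t event_idx,
+                                   uint16_t new_idx,
+                                   uint16_t old_idx)
+{
+        return (uint16_t)(new_idx - event_idx - 1) <
+               (uint16_t)(new_idx - old_idx);
+}
+
+static bool should_notify(bool event_idx_negotiated,
+                          uint16_t old_idx, uint16_t new_idx)
+{
+        if (!event_idx_negotiated)
+                return !(used_flags() & VRING_USED_F_NO_NOTIFY);
+        return vring_need_event(used_avail_event(),
+                                new_idx, old_idx);
+}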
+ Receiving Used Buffers From The Device
+
+Once the device has used a buffer (read from or written to it, or
+parts of both, depending on the nature of the virtqueue and the
+device), it sends an interrupt, following an algorithm very
+similar to the algorithm used for the driver to send the device a
+buffer:
+
+ Write the head descriptor number to the next field in the used
+ ring.
+
+ Update the used ring idx.
+
+ Determine whether an interrupt is necessary:
+
+ If the VIRTIO_F_RING_EVENT_IDX feature is not negotiated: check
+ that the VRING_AVAIL_F_NO_INTERRUPT flag is not set in
+ avail->flags.
+
+ If the VIRTIO_F_RING_EVENT_IDX feature is negotiated: check
+ whether the used index crossed the used_event field value
+ since the last update. The used_event field wraps naturally
+ at 65536 as well:
+
+(u16)(new_idx - used_event - 1) < (u16)(new_idx - old_idx)
+
+ If an interrupt is necessary:
+
+ If MSI-X capability is disabled:
+
+ Set the lower bit of the ISR Status field for the device.
+
+ Send the appropriate PCI interrupt for the device.
+
+ If MSI-X capability is enabled:
+
+ Request the appropriate MSI-X interrupt message for the
+ device; the Queue Vector field selects the MSI-X Table entry
+ number to use.
+
+ If the Queue Vector field value is NO_VECTOR, no interrupt
+ message is requested for this event.
+
+The guest interrupt handler should:
+
+ If MSI-X capability is disabled: read the ISR Status field,
+ which will reset it to zero. If the lower bit is zero, the
+ interrupt was not for this device. Otherwise, the guest driver
+ should look through the used rings of each virtqueue for the
+ device, to see if any progress has been made by the device
+ which requires servicing.
+
+ If MSI-X capability is enabled: look through the used rings of
+ each virtqueue mapped to the specific MSI-X vector for the
+ device, to see if any progress has been made by the device
+ which requires servicing.
+
+For each ring, the guest should then disable interrupts, if
+required, by setting the VRING_AVAIL_F_NO_INTERRUPT flag in the
+avail structure. It can then process used ring entries, finally
+re-enabling interrupts by clearing the VRING_AVAIL_F_NO_INTERRUPT
+flag or updating the used_event field in the available structure.
+The guest should then execute a memory barrier, and then recheck
+the ring empty condition. This is necessary to handle the case
+where, after the last check and before enabling interrupts, an
+interrupt has been suppressed by the device:
+
+vring_disable_interrupts(vq);
+
+for (;;) {
+        if (vq->last_seen_used == vring->used.idx) {
+                vring_enable_interrupts(vq);
+                mb();
+                if (vq->last_seen_used == vring->used.idx)
+                        break;
+        }
+
+        struct vring_used_elem *e =
+                &vring->used.ring[vq->last_seen_used % vsz];
+        process_buffer(e);
+        vq->last_seen_used++;
+}
+
+ Dealing With Configuration Changes
+
+Some virtio PCI devices can change the device configuration
+state, as reflected in the virtio header in the first I/O region
+of the PCI device. In this case:
+
+ If MSI-X capability is disabled: an interrupt is delivered and
+ the second lowest bit is set in the ISR Status field to
+ indicate that the driver should re-examine the configuration
+ space. Note that a single interrupt can indicate both that one
+ or more virtqueues have been used and that the configuration
+ space has changed: even if the config bit is set, virtqueues
+ must be scanned.
+ If MSI-X capability is enabled: an interrupt message is
+ requested. The Configuration Vector field sets the MSI-X Table
+ entry number to use. If the Configuration Vector field value is
+ NO_VECTOR, no interrupt message is requested for this event.
+
+Creating New Device Types
+
+Various considerations are necessary when creating a new device
+type:
+
+ How Many Virtqueues?
+
+It is possible that a very simple device will operate entirely
+through its configuration space, but most will need at least one
+virtqueue in which it will place requests. A device with both
+input and output (eg. the console and network devices described
+here) needs two queues: one which the driver fills with buffers
+to receive input, and one in which the driver places buffers to
+transmit output.
+
+ What Configuration Space Layout?
+
+Configuration space is generally used for rarely-changing or
+initialization-time parameters. But it is a limited resource, so
+it might be better to use a virtqueue to update configuration
+information (the network device does this for filtering,
+otherwise the table in the config space could potentially be very
+large).
+
+Note that this space is generally the guest's native endian,
+rather than PCI's little-endian.
+
+ What Device Number?
+
+Currently device numbers are assigned quite freely: a simple
+request mail to the author of this document or the Linux
+virtualization mailing list[footnote:
+
+https://lists.linux-foundation.org/mailman/listinfo/virtualization
+] will be sufficient to secure a unique one.
+
+Meanwhile for experimental drivers, use 65535 and work backwards.
+
+ How many MSI-X vectors?
+
+Using the optional MSI-X capability, devices can speed up
+interrupt processing by removing the need for the guest driver to
+read the ISR Status register (which might be an expensive
+operation), by reducing interrupt sharing between devices and
+queues within the device, and by handling interrupts from
+multiple CPUs. However, some systems impose a limit (which might
+be as low as 256) on the total number of MSI-X vectors that can
+be allocated to all devices. Devices and/or device drivers should
+take this into account, limiting the number of vectors used
+unless the device is expected to cause a high volume of
+interrupts. Devices can control the number of vectors used by
+limiting the MSI-X Table Size or by not presenting the MSI-X
+capability in PCI configuration space. Drivers can control this
+by mapping events to as small a number of vectors as possible, or
+by disabling the MSI-X capability altogether.
+
+ Message Framing
+
+The descriptors used for a buffer should not affect the semantics
+of the message, except for the total length of the buffer. For
+example, a network buffer consists of a 10 byte header followed
+by the network packet. Whether this is presented in the ring
+descriptor chain as (say) a 10 byte buffer and a 1514 byte
+buffer, or a single 1524 byte buffer, or even three buffers,
+should have no effect.
+
+In particular, no implementation should use the descriptor
+boundaries to determine the size of any header in a request.[footnote:
+The current qemu device implementations mistakenly insist that
+the first descriptor cover the header in these cases exactly, so
+a cautious driver should arrange it so.
+]
+
+ Device Improvements
+
+Any change to configuration space, or new virtqueues, or
+behavioural changes, should be indicated by negotiation of a new
+feature bit. This establishes clarity[footnote:
+Even if it does mean documenting design or implementation
+mistakes!
+] and avoids future expansion problems. + +Clusters of functionality which are always implemented together +can use a single bit, but if one feature makes sense without the +others they should not be gratuitously grouped together to +conserve feature bits. We can always extend the spec when the +first person needs more than 24 feature bits for their device. + +[LaTeX Command: printnomenclature] + +Appendix A: virtio_ring.h + +#ifndef VIRTIO_RING_H + +#define VIRTIO_RING_H + +/* An interface for efficient virtio implementation. + + * + + * This header is BSD licensed so anyone can use the definitions + + * to implement compatible drivers/servers. + + * + + * Copyright 2007, 2009, IBM Corporation + + * Copyright 2011, Red Hat, Inc + + * All rights reserved. + + * + + * Redistribution and use in source and binary forms, with or +without + + * modification, are permitted provided that the following +conditions + + * are met: + + * 1. Redistributions of source code must retain the above +copyright + + * notice, this list of conditions and the following +disclaimer. + + * 2. Redistributions in binary form must reproduce the above +copyright + + * notice, this list of conditions and the following +disclaimer in the + + * documentation and/or other materials provided with the +distribution. + + * 3. Neither the name of IBM nor the names of its contributors + + * may be used to endorse or promote products derived from +this software + + * without specific prior written permission. + + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND +CONTRIBUTORS ``AS IS'' AND + + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED +TO, THE + + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A +PARTICULAR PURPOSE + + * ARE DISCLAIMED. IN NO EVENT SHALL IBM OR CONTRIBUTORS BE +LIABLE + + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR +CONSEQUENTIAL + + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF +SUBSTITUTE GOODS + + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) + + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN +CONTRACT, STRICT + + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING +IN ANY WAY + + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF + + * SUCH DAMAGE. + + */ + + + +/* This marks a buffer as continuing via the next field. */ + +#define VRING_DESC_F_NEXT 1 + +/* This marks a buffer as write-only (otherwise read-only). */ + +#define VRING_DESC_F_WRITE 2 + + + +/* The Host uses this in used->flags to advise the Guest: don't +kick me + + * when you add a buffer. It's unreliable, so it's simply an + + * optimization. Guest will still kick if it's out of buffers. +*/ + +#define VRING_USED_F_NO_NOTIFY 1 + +/* The Guest uses this in avail->flags to advise the Host: don't + + * interrupt me when you consume a buffer. It's unreliable, so +it's + + * simply an optimization. */ + +#define VRING_AVAIL_F_NO_INTERRUPT 1 + + + +/* Virtio ring descriptors: 16 bytes. + + * These can chain together via "next". */ + +struct vring_desc { + + /* Address (guest-physical). */ + + uint64_t addr; + + /* Length. */ + + uint32_t len; + + /* The flags as indicated above. */ + + uint16_t flags; + + /* We chain unused descriptors via this, too */ + + uint16_t next; + +}; + + + +struct vring_avail { + + uint16_t flags; + + uint16_t idx; + + uint16_t ring[]; + + uint16_t used_event; + +}; + + + +/* u32 is used here for ids for padding reasons. 
*/ + +struct vring_used_elem { + + /* Index of start of used descriptor chain. */ + + uint32_t id; + + /* Total length of the descriptor chain which was written +to. */ + + uint32_t len; + +}; + + + +struct vring_used { + + uint16_t flags; + + uint16_t idx; + + struct vring_used_elem ring[]; + + uint16_t avail_event; + +}; + + + +struct vring { + + unsigned int num; + + + + struct vring_desc *desc; + + struct vring_avail *avail; + + struct vring_used *used; + +}; + + + +/* The standard layout for the ring is a continuous chunk of +memory which + + * looks like this. We assume num is a power of 2. + + * + + * struct vring { + + * // The actual descriptors (16 bytes each) + + * struct vring_desc desc[num]; + + * + + * // A ring of available descriptor heads with free-running +index. + + * __u16 avail_flags; + + * __u16 avail_idx; + + * __u16 available[num]; + + * + + * // Padding to the next align boundary. + + * char pad[]; + + * + + * // A ring of used descriptor heads with free-running +index. + + * __u16 used_flags; + + * __u16 EVENT_IDX; + + * struct vring_used_elem used[num]; + + * }; + + * Note: for virtio PCI, align is 4096. + + */ + +static inline void vring_init(struct vring *vr, unsigned int num, +void *p, + + unsigned long align) + +{ + + vr->num = num; + + vr->desc = p; + + vr->avail = p + num*sizeof(struct vring_desc); + + vr->used = (void *)(((unsigned long)&vr->avail->ring[num] + + + align-1) + + & ~(align - 1)); + +} + + + +static inline unsigned vring_size(unsigned int num, unsigned long +align) + +{ + + return ((sizeof(struct vring_desc)*num + +sizeof(uint16_t)*(2+num) + + + align - 1) & ~(align - 1)) + + + sizeof(uint16_t)*3 + sizeof(struct +vring_used_elem)*num; + +} + + + +static inline int vring_need_event(uint16_t event_idx, uint16_t +new_idx, uint16_t old_idx) + +{ + + return (uint16_t)(new_idx - event_idx - 1) < +(uint16_t)(new_idx - old_idx); + +} + +#endif /* VIRTIO_RING_H */ + +Appendix B: Reserved Feature Bits + +Currently there are five device-independent feature bits defined: + + VIRTIO_F_NOTIFY_ON_EMPTY (24) Negotiating this feature + indicates that the driver wants an interrupt if the device runs + out of available descriptors on a virtqueue, even though + interrupts are suppressed using the VRING_AVAIL_F_NO_INTERRUPT + flag or the used_event field. An example of this is the + networking driver: it doesn't need to know every time a packet + is transmitted, but it does need to free the transmitted + packets a finite time after they are transmitted. It can avoid + using a timer if the device interrupts it when all the packets + are transmitted. + + VIRTIO_F_RING_INDIRECT_DESC (28) Negotiating this feature + indicates that the driver can use descriptors with the + VRING_DESC_F_INDIRECT flag set, as described in [sub:Indirect-Descriptors] + . + + VIRTIO_F_RING_EVENT_IDX(29) This feature enables the used_event + and the avail_event fields. If set, it indicates that the + device should ignore the flags field in the available ring + structure. Instead, the used_event field in this structure is + used by guest to suppress device interrupts. Further, the + driver should ignore the flags field in the used ring + structure. Instead, the avail_event field in this structure is + used by the device to suppress notifications. 
If unset, the
+ driver should ignore the used_event field; the device should
+ ignore the avail_event field; and the flags fields are used
+ instead.
+
+ VIRTIO_F_BAD_FEATURE(30) This feature should never be
+ negotiated by the guest; doing so is an indication that the
+ guest is faulty[footnote:
+An experimental virtio PCI driver contained in Linux version
+2.6.25 had this problem, and this feature bit can be used to
+detect it.
+]
+
+ VIRTIO_F_FEATURES_HIGH(31) This feature indicates that the
+ device supports feature bits 32:63. If unset, feature bits
+ 32:63 are unset.
+
+Appendix C: Network Device
+
+The virtio network device is a virtual ethernet card, and is the
+most complex of the devices supported so far by virtio. It has
+been enhanced rapidly and demonstrates clearly how support for
+new features should be added to an existing device. Empty buffers
+are placed in one virtqueue for receiving packets, and outgoing
+packets are enqueued into another for transmission in that order.
+A third command queue is used to control advanced filtering
+features.
+
+ Configuration
+
+ Subsystem Device ID 1
+
+ Virtqueues 0:receiveq. 1:transmitq. 2:controlq[footnote:
+Only if VIRTIO_NET_F_CTRL_VQ set
+]
+
+ Feature bits
+
+ VIRTIO_NET_F_CSUM (0) Device handles packets with partial
+ checksum
+
+ VIRTIO_NET_F_GUEST_CSUM (1) Guest handles packets with partial
+ checksum
+
+ VIRTIO_NET_F_MAC (5) Device has given MAC address.
+
+ VIRTIO_NET_F_GSO (6) (Deprecated) device handles packets with
+ any GSO type.[footnote:
+It was supposed to indicate segmentation offload support, but
+upon further investigation it became clear that multiple bits
+were required.
+]
+
+ VIRTIO_NET_F_GUEST_TSO4 (7) Guest can receive TSOv4.
+
+ VIRTIO_NET_F_GUEST_TSO6 (8) Guest can receive TSOv6.
+
+ VIRTIO_NET_F_GUEST_ECN (9) Guest can receive TSO with ECN.
+
+ VIRTIO_NET_F_GUEST_UFO (10) Guest can receive UFO.
+
+ VIRTIO_NET_F_HOST_TSO4 (11) Device can receive TSOv4.
+
+ VIRTIO_NET_F_HOST_TSO6 (12) Device can receive TSOv6.
+
+ VIRTIO_NET_F_HOST_ECN (13) Device can receive TSO with ECN.
+
+ VIRTIO_NET_F_HOST_UFO (14) Device can receive UFO.
+
+ VIRTIO_NET_F_MRG_RXBUF (15) Guest can merge receive buffers.
+
+ VIRTIO_NET_F_STATUS (16) Configuration status field is
+ available.
+
+ VIRTIO_NET_F_CTRL_VQ (17) Control channel is available.
+
+ VIRTIO_NET_F_CTRL_RX (18) Control channel RX mode support.
+
+ VIRTIO_NET_F_CTRL_VLAN (19) Control channel VLAN filtering.
+
+ Device configuration layout Two configuration fields are
+ currently defined. The mac address field always exists (though
+ it is only valid if VIRTIO_NET_F_MAC is set), and the status
+ field only exists if VIRTIO_NET_F_STATUS is set. Only one bit
+ is currently defined for the status field: VIRTIO_NET_S_LINK_UP.
+
+#define VIRTIO_NET_S_LINK_UP 1
+
+struct virtio_net_config {
+        u8 mac[6];
+        u16 status;
+};
+
+ Device Initialization
+
+ The initialization routine should identify the receive and
+ transmission virtqueues.
+
+ If the VIRTIO_NET_F_MAC feature bit is set, the configuration
+ space “mac” entry indicates the “physical” address of the
+ network card, otherwise a private MAC address should be
+ assigned. All guests are expected to negotiate this feature if
+ it is set.
+
+ If the VIRTIO_NET_F_CTRL_VQ feature bit is negotiated, identify
+ the control virtqueue.
+
+ If the VIRTIO_NET_F_STATUS feature bit is negotiated, the link
+ status can be read from the bottom bit of the “status” config
+ field. Otherwise, the link should be assumed active.
+
+ The receive virtqueue should be filled with receive buffers.
+ This is described in detail below in “Setting Up Receive
+ Buffers”.
+
+ A driver can indicate that it will generate checksumless
+ packets by negotiating the VIRTIO_NET_F_CSUM feature. This
+ “checksum offload” is a common feature on modern network cards.
+
+ If that feature is negotiated, a driver can use TCP or UDP
+ segmentation offload by negotiating the VIRTIO_NET_F_HOST_TSO4
+ (IPv4 TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP) and
+ VIRTIO_NET_F_HOST_UFO (UDP fragmentation) features. It should
+ not send TCP packets requiring segmentation offload which have
+ the Explicit Congestion Notification bit set, unless the
+ VIRTIO_NET_F_HOST_ECN feature is negotiated.[footnote:
+This is a common restriction in real, older network cards.
+]
+
+ The converse features are also available: a driver can save the
+ virtual device some work by negotiating these features.[footnote:
+For example, a network packet transported between two guests on
+the same system may not require checksumming at all, nor
+segmentation, if both guests are amenable.
+] The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
+ checksummed packets can be received, and if it can do that then
+ the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
+ VIRTIO_NET_F_GUEST_UFO and VIRTIO_NET_F_GUEST_ECN are the input
+ equivalents of the features described above. See “Receiving
+ Packets” below.
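+To illustrate the configuration reads during initialization, a
+driver might do something like the following (a sketch:
+vp_ioread8()/vp_ioread16() are assumed platform accessors, and
+'base' is the device-specific offset described in
+“Configuration/Queue Vectors”; feature checks are elided):
+
+#include <stdint.h>
+
+extern uint8_t  vp_ioread8(unsigned off);
+extern uint16_t vp_ioread16(unsigned off);
+
+struct virtio_net_config_copy {
+        uint8_t  mac[6];
+        uint16_t status;
+};
+
+static void read_net_config(unsigned base,
+                            struct virtio_net_config_copy *cfg)
+{
+        for (unsigned i = 0; i < 6; i++)
+                cfg->mac[i] = vp_ioread8(base + i);
+        /* device-specific fields may be accessed at any width */
+        cfg->status = vp_ioread16(base + 6);
+}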
+ Device Operation
+
+Packets are transmitted by placing them in the transmitq, and
+buffers for incoming packets are placed in the receiveq. In each
+case, the packet itself is preceded by a header:
+
+struct virtio_net_hdr {
+#define VIRTIO_NET_HDR_F_NEEDS_CSUM 1
+        u8 flags;
+#define VIRTIO_NET_HDR_GSO_NONE  0
+#define VIRTIO_NET_HDR_GSO_TCPV4 1
+#define VIRTIO_NET_HDR_GSO_UDP   3
+#define VIRTIO_NET_HDR_GSO_TCPV6 4
+#define VIRTIO_NET_HDR_GSO_ECN   0x80
+        u8 gso_type;
+        u16 hdr_len;
+        u16 gso_size;
+        u16 csum_start;
+        u16 csum_offset;
+/* Only if VIRTIO_NET_F_MRG_RXBUF: */
+        u16 num_buffers;
+};
+
+The controlq is used to control device features such as
+filtering.
+
+ Packet Transmission
+
+Transmitting a single packet is simple, but varies depending on
+the different features the driver negotiated.
+
+ If the driver negotiated VIRTIO_NET_F_CSUM, and the packet has
+ not been fully checksummed, then the virtio_net_hdr's fields
+ are set as follows. Otherwise, the packet must be fully
+ checksummed, and flags is zero.
+
+ flags has the VIRTIO_NET_HDR_F_NEEDS_CSUM bit set,
+
+ csum_start is set to the offset within
+ the packet to begin checksumming, and
+
+ csum_offset indicates how many bytes after the csum_start the
+ new (16 bit ones' complement) checksum should be placed.[footnote:
+For example, consider a partially checksummed TCP (IPv4) packet.
+It will have a 14 byte ethernet header and 20 byte IP header
+followed by the TCP header (with the TCP checksum field 16 bytes
+into that header). csum_start will be 14+20 = 34 (the TCP
+checksum includes the header), and csum_offset will be 16. The
+value in the TCP checksum field will be the sum of the TCP pseudo
+header, so that replacing it by the ones' complement checksum of
+the TCP header and body will give the correct result.
+]
+
+ If the driver negotiated
+ VIRTIO_NET_F_HOST_TSO4, TSO6 or UFO, and the packet requires
+ TCP segmentation or UDP fragmentation, then the “gso_type”
+ field is set to VIRTIO_NET_HDR_GSO_TCPV4, TCPV6 or UDP.
+ (Otherwise, it is set to VIRTIO_NET_HDR_GSO_NONE).
In this
+ case, packets larger than 1514 bytes can be transmitted: the
+ metadata indicates how to replicate the packet header to cut it
+ into smaller packets. The other gso fields are set:
+
+ hdr_len is a hint to the device as to how much of the header
+ needs to be kept to copy into each packet, usually set to the
+ length of the headers, including the transport header.[footnote:
+Due to various bugs in implementations, this field is not useful
+as a guarantee of the transport header size.
+]
+
+ gso_size is the size of the packet beyond that header (ie.
+ MSS).
+
+ If the driver negotiated the VIRTIO_NET_F_HOST_ECN feature, the
+ VIRTIO_NET_HDR_GSO_ECN bit may be set in “gso_type” as well,
+ indicating that the TCP packet has the ECN bit set.[footnote:
+This case is not handled by some older hardware, so is called out
+specifically in the protocol.
+]
+
+ If the driver negotiated the VIRTIO_NET_F_MRG_RXBUF feature,
+ the num_buffers field is set to zero.
+
+ The header and packet are added as one output buffer to the
+ transmitq, and the device is notified of the new entry (see
+ [sub:Notifying-The-Device]).[footnote:
+Note that the header will be two bytes longer for the
+VIRTIO_NET_F_MRG_RXBUF case.
+]
+
+ Packet Transmission Interrupt
+
+Often a driver will suppress transmission interrupts using the
+VRING_AVAIL_F_NO_INTERRUPT flag (see [sub:Receiving-Used-Buffers])
+and check for used packets in the transmit path of following
+packets. However, it will still receive interrupts if the
+VIRTIO_F_NOTIFY_ON_EMPTY feature is negotiated, indicating that
+the transmission queue has been completely emptied.
+
+The normal behavior in this interrupt handler is to retrieve the
+new descriptors from the used ring and free the corresponding
+headers and packets.
+
+ Setting Up Receive Buffers
+
+It is generally a good idea to keep the receive virtqueue as
+fully populated as possible: if it runs out, network performance
+will suffer.
+
+If the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6 or
+VIRTIO_NET_F_GUEST_UFO features are used, the Guest will need to
+accept packets up to 65550 bytes long (the maximum size of a
+TCP or UDP packet, plus the 14 byte ethernet header), otherwise
+1514 bytes. So unless VIRTIO_NET_F_MRG_RXBUF is negotiated, every
+buffer in the receive queue needs to be at least this length
+[footnote:
+Obviously each one can be split across multiple descriptor
+elements.
+].
+
+If VIRTIO_NET_F_MRG_RXBUF is negotiated, each buffer must be at
+least the size of the struct virtio_net_hdr.
+
+ Packet Receive Interrupt
+
+When a packet is copied into a buffer in the receiveq, the
+optimal path is to disable further interrupts for the receiveq
+(see [sub:Receiving-Used-Buffers]) and process packets until no
+more are found, then re-enable them.
+
+Processing a packet involves:
+
+ If the driver negotiated the VIRTIO_NET_F_MRG_RXBUF feature,
+ then the “num_buffers” field indicates how many descriptors
+ this packet is spread over (including this one). This allows
+ receipt of large packets without having to allocate large
+ buffers. In this case, there will be at least “num_buffers” in
+ the used ring, and they should be chained together to form a
+ single packet. The other buffers will not begin with a struct
+ virtio_net_hdr.
+
+ If the VIRTIO_NET_F_MRG_RXBUF feature was not negotiated, or
+ the “num_buffers” field is one, then the entire packet will be
+ contained within this buffer, immediately following the struct
+ virtio_net_hdr.
+
+ If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
+ VIRTIO_NET_HDR_F_NEEDS_CSUM bit in the “flags” field may be
+ set: if so, the checksum on the packet is incomplete and the
+ “csum_start” and “csum_offset” fields indicate how to calculate
+ it (see [ite:csum_start-is-set]).
+
+ If the VIRTIO_NET_F_GUEST_TSO4, TSO6 or UFO options were
+ negotiated, then the “gso_type” may be something other than
+ VIRTIO_NET_HDR_GSO_NONE, and the “gso_size” field indicates the
+ desired MSS (see [enu:If-the-driver]).
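+Completing such a partial checksum is mechanical. A sketch
+(illustrative only; 'pkt' is the packet without its
+virtio_net_hdr, and the ones' complement sum runs from csum_start
+to the end of the packet, including the stored pseudo-header sum,
+as described above):
+
+#include <stdint.h>
+#include <stddef.h>
+
+static void complete_csum(uint8_t *pkt, size_t len,
+                          uint16_t csum_start,
+                          uint16_t csum_offset)
+{
+        uint32_t sum = 0;
+
+        /* 16-bit ones' complement sum over [csum_start, len) */
+        for (size_t i = csum_start; i + 1 < len; i += 2)
+                sum += (uint32_t)((pkt[i] << 8) | pkt[i + 1]);
+        if ((len - csum_start) & 1)
+                sum += (uint32_t)pkt[len - 1] << 8;
+        while (sum >> 16)                       /* fold carries */
+                sum = (sum & 0xffff) + (sum >> 16);
+
+        uint16_t csum = (uint16_t)~sum;
+        pkt[csum_start + csum_offset]     = csum >> 8;
+        pkt[csum_start + csum_offset + 1] = csum & 0xff;
+}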
+ Control Virtqueue
+
+The driver uses the control virtqueue (if VIRTIO_NET_F_CTRL_VQ is
+negotiated) to send commands to manipulate various features of
+the device which would not easily map into the configuration
+space.
+
+All commands are of the following form:
+
+struct virtio_net_ctrl {
+        u8 class;
+        u8 command;
+        u8 command-specific-data[];
+        u8 ack;
+};
+
+/* ack values */
+#define VIRTIO_NET_OK  0
+#define VIRTIO_NET_ERR 1
+
+The class, command and command-specific-data are set by the
+driver, and the device sets the ack byte. There is little the
+driver can do except issue a diagnostic if the ack byte is not
+VIRTIO_NET_OK.
+
+ Packet Receive Filtering
+
+If the VIRTIO_NET_F_CTRL_RX feature is negotiated, the driver can
+send control commands for promiscuous mode, multicast receiving,
+and filtering of MAC addresses.
+
+Note that in general, these commands are best-effort: unwanted
+packets may still arrive.
+
+ Setting Promiscuous Mode
+
+#define VIRTIO_NET_CTRL_RX 0
+ #define VIRTIO_NET_CTRL_RX_PROMISC 0
+ #define VIRTIO_NET_CTRL_RX_ALLMULTI 1
+
+The class VIRTIO_NET_CTRL_RX has two commands:
+VIRTIO_NET_CTRL_RX_PROMISC turns promiscuous mode on and off, and
+VIRTIO_NET_CTRL_RX_ALLMULTI turns all-multicast receive on and
+off. The command-specific-data is one byte containing 0 (off) or
+1 (on).
+
+ Setting MAC Address Filtering
+
+struct virtio_net_ctrl_mac {
+        u32 entries;
+        u8 macs[entries][ETH_ALEN];
+};
+
+#define VIRTIO_NET_CTRL_MAC 1
+ #define VIRTIO_NET_CTRL_MAC_TABLE_SET 0
+
+The device can filter incoming packets by any number of
+destination MAC addresses.[footnote:
+Since there are no guarantees, it can use a hash filter or
+silently switch to allmulti or promiscuous mode if it is given
+too many addresses.
+] This table is set using the class VIRTIO_NET_CTRL_MAC and the
+command VIRTIO_NET_CTRL_MAC_TABLE_SET. The command-specific-data
+is two variable length tables of 6-byte MAC addresses. The first
+table contains unicast addresses, and the second contains
+multicast addresses.
+
+ VLAN Filtering
+
+If the driver negotiates the VIRTIO_NET_F_CTRL_VLAN feature, it
+can control a VLAN filter table in the device.
+
+#define VIRTIO_NET_CTRL_VLAN 2
+ #define VIRTIO_NET_CTRL_VLAN_ADD 0
+ #define VIRTIO_NET_CTRL_VLAN_DEL 1
+
+Both the VIRTIO_NET_CTRL_VLAN_ADD and VIRTIO_NET_CTRL_VLAN_DEL
+commands take a 16-bit VLAN id as the command-specific-data.
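+As an illustration, issuing one of these commands amounts to
+queueing a read-only buffer carrying the class/command/data and a
+one-byte write-only ack buffer on the controlq. A sketch
+(hypothetical helpers add_ctrl_buffers() and
+kick_and_wait_controlq() wrap the virtqueue mechanics described
+earlier):
+
+#include <stdint.h>
+
+extern void add_ctrl_buffers(const void *out, unsigned out_len,
+                             void *in, unsigned in_len);
+extern void kick_and_wait_controlq(void);
+
+static int set_promiscuous(uint8_t on)   /* 0 (off) or 1 (on) */
+{
+        uint8_t out[3] = { 0 /* VIRTIO_NET_CTRL_RX */,
+                           0 /* VIRTIO_NET_CTRL_RX_PROMISC */,
+                           on };
+        uint8_t ack = 0xff;
+
+        add_ctrl_buffers(out, sizeof(out), &ack, sizeof(ack));
+        kick_and_wait_controlq();
+        return ack == 0 /* VIRTIO_NET_OK */ ? 0 : -1;
+}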
+Appendix D: Block Device
+
+The virtio block device is a simple virtual block device (ie.
+disk). Read and write requests (and other exotic requests) are
+placed in the queue, and serviced (probably out of order) by the
+device except where noted.
+
+ Configuration
+
+ Subsystem Device ID 2
+
+ Virtqueues 0:requestq.
+
+ Feature bits
+
+ VIRTIO_BLK_F_BARRIER (0) Host supports request barriers.
+
+ VIRTIO_BLK_F_SIZE_MAX (1) Maximum size of any single segment is
+ in “size_max”.
+
+ VIRTIO_BLK_F_SEG_MAX (2) Maximum number of segments in a
+ request is in “seg_max”.
+
+ VIRTIO_BLK_F_GEOMETRY (4) Disk-style geometry specified in
+ “geometry”.
+
+ VIRTIO_BLK_F_RO (5) Device is read-only.
+
+ VIRTIO_BLK_F_BLK_SIZE (6) Block size of disk is in “blk_size”.
+
+ VIRTIO_BLK_F_SCSI (7) Device supports scsi packet commands.
+
+ VIRTIO_BLK_F_FLUSH (9) Cache flush command support.
+
+ Device configuration layout The capacity of the device
+ (expressed in 512-byte sectors) is always present. The
+ availability of the others all depend on various feature bits
+ as indicated above.
+
+struct virtio_blk_config {
+        u64 capacity;
+        u32 size_max;
+        u32 seg_max;
+        struct virtio_blk_geometry {
+                u16 cylinders;
+                u8 heads;
+                u8 sectors;
+        } geometry;
+        u32 blk_size;
+};
+
+ Device Initialization
+
+ The device size should be read from the “capacity”
+ configuration field. No requests should be submitted which go
+ beyond this limit.
+
+ If the VIRTIO_BLK_F_BLK_SIZE feature is negotiated, the
+ blk_size field can be read to determine the optimal sector size
+ for the driver to use. This does not affect the units used in
+ the protocol (always 512 bytes), but awareness of the correct
+ value can affect performance.
+
+ If the VIRTIO_BLK_F_RO feature is set by the device, any write
+ requests will fail.
+
+ Device Operation
+
+The driver queues requests to the virtqueue, and they are used by
+the device (not necessarily in order). Each request is of the
+form:
+
+struct virtio_blk_req {
+        u32 type;
+        u32 ioprio;
+        u64 sector;
+        char data[][512];
+        u8 status;
+};
+
+If the device has the VIRTIO_BLK_F_SCSI feature, it can also
+support scsi packet command requests; each of these requests is
+of the form:
+
+struct virtio_scsi_pc_req {
+        u32 type;
+        u32 ioprio;
+        u64 sector;
+        char cmd[];
+        char data[][512];
+#define SCSI_SENSE_BUFFERSIZE 96
+        u8 sense[SCSI_SENSE_BUFFERSIZE];
+        u32 errors;
+        u32 data_len;
+        u32 sense_len;
+        u32 residual;
+        u8 status;
+};
+
+The type of the request is either a read (VIRTIO_BLK_T_IN), a
+write (VIRTIO_BLK_T_OUT), a scsi packet command
+(VIRTIO_BLK_T_SCSI_CMD or VIRTIO_BLK_T_SCSI_CMD_OUT[footnote:
+the SCSI_CMD and SCSI_CMD_OUT types are equivalent, the device
+does not distinguish between them
+]) or a flush (VIRTIO_BLK_T_FLUSH or VIRTIO_BLK_T_FLUSH_OUT[footnote:
+the FLUSH and FLUSH_OUT types are equivalent, the device does not
+distinguish between them
+]). If the device has the VIRTIO_BLK_F_BARRIER feature, the high
+bit (VIRTIO_BLK_T_BARRIER) indicates that this request acts as a
+barrier and that all preceding requests must be complete before
+this one, and all following requests must not be started until
+this is complete. Note that a barrier does not flush caches in
+the underlying backend device in the host, and thus does not
+serve as a data consistency guarantee. The driver must use a
+FLUSH request to flush the host cache.
+
+#define VIRTIO_BLK_T_IN 0
+#define VIRTIO_BLK_T_OUT 1
+#define VIRTIO_BLK_T_SCSI_CMD 2
+#define VIRTIO_BLK_T_SCSI_CMD_OUT 3
+#define VIRTIO_BLK_T_FLUSH 4
+#define VIRTIO_BLK_T_FLUSH_OUT 5
+#define VIRTIO_BLK_T_BARRIER 0x80000000
+
+The ioprio field is a hint about the relative priorities of
+requests to the device: higher numbers indicate more important
+requests.
+
+The sector number indicates the offset (multiplied by 512) where
+the read or write is to occur. This field is unused and set to 0
+for scsi packet commands and for flush commands.
+
+The cmd field is only present for scsi packet command requests,
+and indicates the command to perform. This field must reside in a
+single, separate read-only buffer; command length can be derived
+from the length of this buffer.
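+Concretely, a simple read request is a three-part descriptor
+chain: a read-only header, a write-only data buffer and a
+write-only status byte. A sketch (the hypothetical helper
+add_request() queues the three buffers on the requestq; the
+header layout matches the first three fields above):
+
+#include <stdint.h>
+
+#define VIRTIO_BLK_T_IN 0
+
+struct virtio_blk_req_hdr {        /* read-only prefix */
+        uint32_t type;
+        uint32_t ioprio;
+        uint64_t sector;
+};
+
+extern void add_request(const void *hdr, unsigned hdr_len,
+                        void *data, unsigned data_len,
+                        uint8_t *status);
+
+static void read_sector(uint64_t sector, void *buf,
+                        uint8_t *status)
+{
+        struct virtio_blk_req_hdr hdr = {
+                .type   = VIRTIO_BLK_T_IN,
+                .ioprio = 0,
+                .sector = sector,
+        };
+        add_request(&hdr, sizeof(hdr), buf, 512, status);
+}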
This field must reside in a +single, separate read-only buffer; command length can be derived +from the length of this buffer. + +Note that these first three (four for scsi packet commands) +fields are always read-only: the data field is either read-only +or write-only, depending on the request. The size of the read or +write can be derived from the total size of the request buffers. + +The sense field is only present for scsi packet command requests, +and indicates the buffer for scsi sense data. + +The data_len field is only present for scsi packet command +requests, this field is deprecated, and should be ignored by the +driver. Historically, devices copied data length there. + +The sense_len field is only present for scsi packet command +requests and indicates the number of bytes actually written to +the sense buffer. + +The residual field is only present for scsi packet command +requests and indicates the residual size, calculated as data +length - number of bytes actually transferred. + +The final status byte is written by the device: either +VIRTIO_BLK_S_OK for success, VIRTIO_BLK_S_IOERR for host or guest +error or VIRTIO_BLK_S_UNSUPP for a request unsupported by host:#define VIRTIO_BLK_S_OK 0 + +#define VIRTIO_BLK_S_IOERR 1 + +#define VIRTIO_BLK_S_UNSUPP 2 + +Historically, devices assumed that the fields type, ioprio and +sector reside in a single, separate read-only buffer; the fields +errors, data_len, sense_len and residual reside in a single, +separate write-only buffer; the sense field in a separate +write-only buffer of size 96 bytes, by itself; the fields errors, +data_len, sense_len and residual in a single write-only buffer; +and the status field is a separate read-only buffer of size 1 +byte, by itself. + +Appendix E: Console Device + +The virtio console device is a simple device for data input and +output. A device may have one or more ports. Each port has a pair +of input and output virtqueues. Moreover, a device has a pair of +control IO virtqueues. The control virtqueues are used to +communicate information between the device and the driver about +ports being opened and closed on either side of the connection, +indication from the host about whether a particular port is a +console port, adding new ports, port hot-plug/unplug, etc., and +indication from the guest about whether a port or a device was +successfully added, port open/close, etc.. For data IO, one or +more empty buffers are placed in the receive queue for incoming +data and outgoing characters are placed in the transmit queue. + + Configuration + + Subsystem Device ID 3 + + Virtqueues 0:receiveq(port0). 1:transmitq(port0), 2:control + receiveq[footnote: +Ports 2 onwards only if VIRTIO_CONSOLE_F_MULTIPORT is set +], 3:control transmitq, 4:receiveq(port1), 5:transmitq(port1), + ... + + Feature bits + + VIRTIO_CONSOLE_F_SIZE (0) Configuration cols and rows fields + are valid. + + VIRTIO_CONSOLE_F_MULTIPORT(1) Device has support for multiple + ports; configuration fields nr_ports and max_nr_ports are + valid and control virtqueues will be used. + + Device configuration layout The size of the console is supplied + in the configuration space if the VIRTIO_CONSOLE_F_SIZE feature + is set. 
Furthermore, if the VIRTIO_CONSOLE_F_MULTIPORT feature
+ is set, the maximum number of ports supported by the device can
+ be fetched.
+
+struct virtio_console_config {
+        u16 cols;
+        u16 rows;
+
+        u32 max_nr_ports;
+};
+
+ Device Initialization
+
+ If the VIRTIO_CONSOLE_F_SIZE feature is negotiated, the driver
+ can read the console dimensions from the configuration fields.
+
+ If the VIRTIO_CONSOLE_F_MULTIPORT feature is negotiated, the
+ driver can spawn multiple ports, not all of which may be
+ attached to a console. Some could be generic ports. In this
+ case, the control virtqueues are enabled and, according to the
+ max_nr_ports configuration-space value, the appropriate number
+ of virtqueues are created. A control message indicating the
+ driver is ready is sent to the host. The host can then send
+ control messages for adding new ports to the device. After
+ creating and initializing each port, a
+ VIRTIO_CONSOLE_PORT_READY control message is sent to the host
+ for that port so the host can let us know of any additional
+ configuration options set for that port.
+
+ The receiveq for each port is populated with one or more
+ receive buffers.
+
+ Device Operation
+
+ For output, a buffer containing the characters is placed in the
+ port's transmitq.[footnote:
+Because this is high importance and low bandwidth, the current
+Linux implementation polls for the buffer to be used, rather than
+waiting for an interrupt, simplifying the implementation
+significantly. However, for generic serial ports with the
+O_NONBLOCK flag set, the polling limitation is relaxed and the
+consumed buffers are freed upon the next write or poll call or
+when a port is closed or hot-unplugged.
+]
+
+ When a buffer is used in the receiveq (signalled by an
+ interrupt), the contents are the input to the port associated
+ with the virtqueue for which the notification was received.
+
+ If the driver negotiated the VIRTIO_CONSOLE_F_SIZE feature, a
+ configuration change interrupt may occur. The updated size can
+ be read from the configuration fields.
+
+ If the driver negotiated the VIRTIO_CONSOLE_F_MULTIPORT
+ feature, active ports are announced by the host using the
+ VIRTIO_CONSOLE_PORT_ADD control message. The same message is
+ used for port hot-plug as well.
+
+ If the host specified a port `name', a sysfs attribute is
+ created with the name filled in, so that udev rules can be
+ written that can create a symlink from the port's name to the
+ char device for port discovery by applications in the guest.
+
+ Changes to ports' state are effected by control messages.
+ Appropriate action is taken on the port indicated in the
+ control message. The layout of the control buffer structure and
+ the associated events are:
+
+struct virtio_console_control {
+        uint32_t id;    /* Port number */
+        uint16_t event; /* The kind of control event */
+        uint16_t value; /* Extra information for the event */
+};
+
+/* Some events for the internal messages (control packets) */
+
+#define VIRTIO_CONSOLE_DEVICE_READY 0
+#define VIRTIO_CONSOLE_PORT_ADD 1
+#define VIRTIO_CONSOLE_PORT_REMOVE 2
+#define VIRTIO_CONSOLE_PORT_READY 3
+#define VIRTIO_CONSOLE_CONSOLE_PORT 4
+#define VIRTIO_CONSOLE_RESIZE 5
+#define VIRTIO_CONSOLE_PORT_OPEN 6
+#define VIRTIO_CONSOLE_PORT_NAME 7
+
+Appendix F: Entropy Device
+
+The virtio entropy device supplies high-quality randomness for
+guest use.
+
+ Configuration
+
+ Subsystem Device ID 4
+
+ Virtqueues 0:requestq.
+
+ Feature bits None currently defined.
+
+ Device configuration layout None currently defined.
+
+ Device Initialization
+
+ The virtqueue is initialized.
+
+ Device Operation
+
+When the driver requires random bytes, it places the descriptor
+of one or more buffers in the queue. It will be completely filled
+with random data by the device.
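+Driving the entropy device is about as simple as virtio gets: the
+driver queues an empty, write-only buffer and waits for the
+device to return it. A sketch (the hypothetical helpers
+add_inbuf() and wait_used() wrap the requestq mechanics from the
+general sections):
+
+#include <stdint.h>
+
+extern void     add_inbuf(void *buf, unsigned len);
+extern unsigned wait_used(void);   /* returns bytes written */
+
+static unsigned get_entropy(uint8_t *buf, unsigned len)
+{
+        add_inbuf(buf, len);    /* write-only buffer for the device */
+        return wait_used();     /* device fills it completely */
+}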
+Appendix G: Memory Balloon Device
+
+The virtio memory balloon device is a primitive device for
+managing guest memory: the device asks for a certain amount of
+memory, and the guest supplies it (or withdraws it, if the device
+has more than it asks for). This allows the guest to adapt to
+changes in allowance of underlying physical memory. If the
+feature is negotiated, the device can also be used to communicate
+guest memory statistics to the host.
+
+ Configuration
+
+ Subsystem Device ID 5
+
+ Virtqueues 0:inflateq. 1:deflateq. 2:statsq.[footnote:
+Only if VIRTIO_BALLOON_F_STATS_VQ set
+]
+
+ Feature bits
+
+ VIRTIO_BALLOON_F_MUST_TELL_HOST (0) Host must be told before
+ pages from the balloon are used.
+
+ VIRTIO_BALLOON_F_STATS_VQ (1) A virtqueue for reporting guest
+ memory statistics is present.
+
+ Device configuration layout Both fields of this configuration
+ are always available. Note that they are little endian, despite
+ the convention that device fields are guest endian:
+
+struct virtio_balloon_config {
+        u32 num_pages;
+        u32 actual;
+};
+
+ Device Initialization
+
+ The inflate and deflate virtqueues are identified.
+
+ If the VIRTIO_BALLOON_F_STATS_VQ feature bit is negotiated:
+
+ Identify the stats virtqueue.
+
+ Add one empty buffer to the stats virtqueue and notify the
+ host.
+
+Device operation begins immediately.
+
+ Device Operation
+
+ Memory Ballooning The device is driven by the receipt of a
+ configuration change interrupt.
+
+ The “num_pages” configuration field is examined. If this is
+ greater than the “actual” number of pages, memory must be given
+ to the balloon. If it is less than the “actual” number of
+ pages, memory may be taken back from the balloon for general
+ use.
+
+ To supply memory to the balloon (aka. inflate):
+
+ The driver constructs an array of addresses of unused memory
+ pages. These addresses are divided by 4096[footnote:
+This is historical, and independent of the guest page size
+] and the descriptor describing the resulting 32-bit array is
+ added to the inflateq.
+
+ To remove memory from the balloon (aka. deflate):
+
+ The driver constructs an array of addresses of memory pages it
+ has previously given to the balloon, as described above. This
+ descriptor is added to the deflateq.
+
+ If the VIRTIO_BALLOON_F_MUST_TELL_HOST feature is set, the
+ guest may not use these requested pages until that descriptor
+ in the deflateq has been used by the device.
+
+ Otherwise, the guest may begin to re-use pages previously given
+ to the balloon before the device has acknowledged their
+ withdrawal. [footnote:
+In this case, deflation advice is merely a courtesy
+]
+
+ In either case, once the device has completed the inflation or
+ deflation, the “actual” field of the configuration should be
+ updated to reflect the new number of pages in the balloon.[footnote:
+As updates to configuration space are not atomic, this field
+isn't particularly reliable, but can be used to diagnose buggy
+guests.
+]
+
+ Memory Statistics
+
+The stats virtqueue is atypical because communication is driven
+by the device (not the driver).
+  Memory Statistics
+
+The stats virtqueue is atypical because communication is driven
+by the device (not the driver). The channel becomes active at
+driver initialization time when the driver adds an empty buffer
+and notifies the device. A request for memory statistics proceeds
+as follows:
+
+  The device pushes the buffer onto the used ring and sends an
+  interrupt.
+
+  The driver pops the used buffer and discards it.
+
+  The driver collects memory statistics and writes them into a
+  new buffer.
+
+  The driver adds the buffer to the virtqueue and notifies the
+  device.
+
+  The device pops the buffer (retaining it to initiate a
+  subsequent request) and consumes the statistics.
+
+  Memory Statistics Format Each statistic consists of a 16 bit
+  tag and a 64 bit value. Both quantities are represented in the
+  native endian of the guest. All statistics are optional and the
+  driver may choose which ones to supply. To guarantee backwards
+  compatibility, unsupported statistics should be omitted.
+
+struct virtio_balloon_stat {
+#define VIRTIO_BALLOON_S_SWAP_IN	0
+#define VIRTIO_BALLOON_S_SWAP_OUT	1
+#define VIRTIO_BALLOON_S_MAJFLT		2
+#define VIRTIO_BALLOON_S_MINFLT		3
+#define VIRTIO_BALLOON_S_MEMFREE	4
+#define VIRTIO_BALLOON_S_MEMTOT		5
+	u16 tag;
+	u64 val;
+} __attribute__((packed));
+
+  Tags
+
+  VIRTIO_BALLOON_S_SWAP_IN The amount of memory that has been
+  swapped in (in bytes).
+
+  VIRTIO_BALLOON_S_SWAP_OUT The amount of memory that has been
+  swapped out to disk (in bytes).
+
+  VIRTIO_BALLOON_S_MAJFLT The number of major page faults that
+  have occurred.
+
+  VIRTIO_BALLOON_S_MINFLT The number of minor page faults that
+  have occurred.
+
+  VIRTIO_BALLOON_S_MEMFREE The amount of memory not being used
+  for any purpose (in bytes).
+
+  VIRTIO_BALLOON_S_MEMTOT The total amount of memory available
+  (in bytes).
+
diff --git a/MAINTAINERS b/MAINTAINERS
index 1e55e1eeb811..28f65c249b97 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1883,7 +1883,7 @@ S:	Maintained
F:	drivers/connector/

CONTROL GROUPS (CGROUPS)
-M:	Paul Menage
+M:	Paul Menage
M:	Li Zefan
L:	containers@lists.linux-foundation.org
S:	Maintained
@@ -1932,7 +1932,7 @@ S:	Maintained
F:	tools/power/cpupower

CPUSETS
-M:	Paul Menage
+M:	Paul Menage
W:	http://www.bullopensource.org/cpuset/
W:	http://oss.sgi.com/projects/cpusets/
S:	Supported
@@ -2649,11 +2649,11 @@ F:	drivers/net/wan/dlci.c
F:	drivers/net/wan/sdla.c

FRAMEBUFFER LAYER
-M:	Paul Mundt
+M:	Florian Tobias Schandinat
L:	linux-fbdev@vger.kernel.org
W:	http://linux-fbdev.sourceforge.net/
Q:	http://patchwork.kernel.org/project/linux-fbdev/list/
-T:	git git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6.git
+T:	git git://github.com/schandinat/linux-2.6.git fbdev-next
S:	Maintained
F:	Documentation/fb/
F:	Documentation/devicetree/bindings/fb/
@@ -4450,8 +4450,8 @@ M:	"David S.
Miller" L: netdev@vger.kernel.org W: http://www.linuxfoundation.org/en/Net W: http://patchwork.ozlabs.org/project/netdev/list/ -T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.git -T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6.git +T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git +T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git S: Maintained F: net/ F: include/net/ @@ -4604,7 +4604,7 @@ F: arch/arm/mach-omap2/clockdomain2xxx_3xxx.c F: arch/arm/mach-omap2/clockdomain44xx.c OMAP AUDIO SUPPORT -M: Jarkko Nikula +M: Jarkko Nikula L: alsa-devel@alsa-project.org (subscribers-only) L: linux-omap@vger.kernel.org S: Maintained @@ -4971,7 +4971,7 @@ M: Paul Mackerras M: Ingo Molnar M: Arnaldo Carvalho de Melo S: Supported -F: kernel/perf_event*.c +F: kernel/events/* F: include/linux/perf_event.h F: arch/*/kernel/perf_event*.c F: arch/*/kernel/*/perf_event*.c @@ -5532,6 +5532,7 @@ F: include/media/*7146* SAMSUNG AUDIO (ASoC) DRIVERS M: Jassi Brar +M: Sangbeom Kim L: alsa-devel@alsa-project.org (moderated for non-subscribers) S: Supported F: sound/soc/samsung @@ -7087,7 +7088,7 @@ S: Supported F: drivers/mmc/host/vub300.c W1 DALLAS'S 1-WIRE BUS -M: Evgeniy Polyakov +M: Evgeniy Polyakov S: Maintained F: Documentation/w1/ F: drivers/w1/ diff --git a/Makefile b/Makefile index b4ca4e111c9a..522fa4784e69 100644 --- a/Makefile +++ b/Makefile @@ -1,8 +1,8 @@ VERSION = 3 PATCHLEVEL = 1 SUBLEVEL = 0 -EXTRAVERSION = -rc1 -NAME = Sneaky Weasel +EXTRAVERSION = -rc6 +NAME = "Divemaster Edition" # *DOCUMENTATION* # To see a list of typical targets execute "make help" diff --git a/arch/alpha/include/asm/sysinfo.h b/arch/alpha/include/asm/sysinfo.h index 086aba284df2..e77d77cd07b8 100644 --- a/arch/alpha/include/asm/sysinfo.h +++ b/arch/alpha/include/asm/sysinfo.h @@ -27,13 +27,4 @@ #define UAC_NOFIX 2 #define UAC_SIGBUS 4 - -#ifdef __KERNEL__ - -/* This is the shift that is applied to the UAC bits as stored in the - per-thread flags. See thread_info.h. */ -#define UAC_SHIFT 6 - -#endif - #endif /* __ASM_ALPHA_SYSINFO_H */ diff --git a/arch/alpha/include/asm/thread_info.h b/arch/alpha/include/asm/thread_info.h index 6f32f9c84a2d..ff73db022342 100644 --- a/arch/alpha/include/asm/thread_info.h +++ b/arch/alpha/include/asm/thread_info.h @@ -74,9 +74,9 @@ register struct thread_info *__current_thread_info __asm__("$8"); #define TIF_NEED_RESCHED 3 /* rescheduling necessary */ #define TIF_POLLING_NRFLAG 8 /* poll_idle is polling NEED_RESCHED */ #define TIF_DIE_IF_KERNEL 9 /* dik recursion lock */ -#define TIF_UAC_NOPRINT 10 /* see sysinfo.h */ -#define TIF_UAC_NOFIX 11 -#define TIF_UAC_SIGBUS 12 +#define TIF_UAC_NOPRINT 10 /* ! Preserve sequence of following */ +#define TIF_UAC_NOFIX 11 /* ! flags as they match */ +#define TIF_UAC_SIGBUS 12 /* ! 
userspace part of 'osf_sysinfo' */ #define TIF_MEMDIE 13 /* is terminating due to OOM killer */ #define TIF_RESTORE_SIGMASK 14 /* restore signal mask in do_signal */ #define TIF_FREEZE 16 /* is freezing for suspend */ @@ -97,7 +97,7 @@ register struct thread_info *__current_thread_info __asm__("$8"); #define _TIF_ALLWORK_MASK (_TIF_WORK_MASK \ | _TIF_SYSCALL_TRACE) -#define ALPHA_UAC_SHIFT 10 +#define ALPHA_UAC_SHIFT TIF_UAC_NOPRINT #define ALPHA_UAC_MASK (1 << TIF_UAC_NOPRINT | 1 << TIF_UAC_NOFIX | \ 1 << TIF_UAC_SIGBUS) diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c index 326f0a2d56e5..01e8715e26d9 100644 --- a/arch/alpha/kernel/osf_sys.c +++ b/arch/alpha/kernel/osf_sys.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include @@ -633,9 +634,10 @@ SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user *, buffer, case GSI_UACPROC: if (nbytes < sizeof(unsigned int)) return -EINVAL; - w = (current_thread_info()->flags >> UAC_SHIFT) & UAC_BITMASK; - if (put_user(w, (unsigned int __user *)buffer)) - return -EFAULT; + w = (current_thread_info()->flags >> ALPHA_UAC_SHIFT) & + UAC_BITMASK; + if (put_user(w, (unsigned int __user *)buffer)) + return -EFAULT; return 1; case GSI_PROC_TYPE: @@ -756,8 +758,8 @@ SYSCALL_DEFINE5(osf_setsysinfo, unsigned long, op, void __user *, buffer, case SSIN_UACPROC: again: old = current_thread_info()->flags; - new = old & ~(UAC_BITMASK << UAC_SHIFT); - new = new | (w & UAC_BITMASK) << UAC_SHIFT; + new = old & ~(UAC_BITMASK << ALPHA_UAC_SHIFT); + new = new | (w & UAC_BITMASK) << ALPHA_UAC_SHIFT; if (cmpxchg(¤t_thread_info()->flags, old, new) != old) goto again; diff --git a/arch/alpha/kernel/systbls.S b/arch/alpha/kernel/systbls.S index b9c28f3f1956..6acea1f96de3 100644 --- a/arch/alpha/kernel/systbls.S +++ b/arch/alpha/kernel/systbls.S @@ -360,7 +360,7 @@ sys_call_table: .quad sys_newuname .quad sys_nanosleep /* 340 */ .quad sys_mremap - .quad sys_nfsservctl + .quad sys_ni_syscall /* old nfsservctl */ .quad sys_setresuid .quad sys_getresuid .quad sys_pciconfig_read /* 345 */ diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig index 3269576dbfa8..5a3a78633177 100644 --- a/arch/arm/Kconfig +++ b/arch/arm/Kconfig @@ -1793,6 +1793,38 @@ config ZBOOT_ROM_SH_MOBILE_SDHI endchoice +config ARM_APPENDED_DTB + bool "Use appended device tree blob to zImage (EXPERIMENTAL)" + depends on OF && !ZBOOT_ROM && EXPERIMENTAL + help + With this option, the boot code will look for a device tree binary + (DTB) appended to zImage + (e.g. cat zImage .dtb > zImage_w_dtb). + + This is meant as a backward compatibility convenience for those + systems with a bootloader that can't be upgraded to accommodate + the documented boot protocol using a device tree. + + Beware that there is very little in terms of protection against + this option being confused by leftover garbage in memory that might + look like a DTB header after a reboot if no actual DTB is appended + to zImage. Do not leave this option active in a production kernel + if you don't intend to always append a DTB. Proper passing of the + location into r2 of a bootloader provided DTB is always preferable + to this option. + +config ARM_ATAG_DTB_COMPAT + bool "Supplement the appended DTB with traditional ATAG information" + depends on ARM_APPENDED_DTB + help + Some old bootloaders can't be updated to a DTB capable one, yet + they provide ATAGs with memory configuration, the ramdisk address, + the kernel cmdline string, etc. 
Such information is dynamically + provided by the bootloader and can't always be stored in a static + DTB. To allow a device tree enabled kernel to be used with such + bootloaders, this option allows zImage to extract the information + from the ATAG list and store it at run time into the appended DTB. + config CMDLINE string "Default kernel command string" default "" diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug index 81cbe40c159c..be3a0f78d915 100644 --- a/arch/arm/Kconfig.debug +++ b/arch/arm/Kconfig.debug @@ -129,4 +129,10 @@ config DEBUG_S3C_UART The uncompressor code port configuration is now handled by CONFIG_S3C_LOWLEVEL_UART_PORT. +config ARM_KPROBES_TEST + tristate "Kprobes test module" + depends on KPROBES && MODULES + help + Perform tests of kprobes API and instruction set simulation. + endmenu diff --git a/arch/arm/boot/compressed/.gitignore b/arch/arm/boot/compressed/.gitignore index c6028967d336..e0936a148516 100644 --- a/arch/arm/boot/compressed/.gitignore +++ b/arch/arm/boot/compressed/.gitignore @@ -5,3 +5,12 @@ piggy.lzo piggy.lzma vmlinux vmlinux.lds + +# borrowed libfdt files +fdt.c +fdt.h +fdt_ro.c +fdt_rw.c +fdt_wip.c +libfdt.h +libfdt_internal.h diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile index 0c74a6fab952..e4f32a8e002a 100644 --- a/arch/arm/boot/compressed/Makefile +++ b/arch/arm/boot/compressed/Makefile @@ -26,6 +26,10 @@ HEAD = head.o OBJS += misc.o decompress.o FONTC = $(srctree)/drivers/video/console/font_acorn_8x8.c +# string library code (-Os is enforced to keep it much smaller) +OBJS += string.o +CFLAGS_string.o := -Os + # # Architecture dependencies # @@ -89,21 +93,41 @@ suffix_$(CONFIG_KERNEL_GZIP) = gzip suffix_$(CONFIG_KERNEL_LZO) = lzo suffix_$(CONFIG_KERNEL_LZMA) = lzma +# Borrowed libfdt files for the ATAG compatibility mode + +libfdt := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c +libfdt_hdrs := fdt.h libfdt.h libfdt_internal.h + +libfdt_objs := $(addsuffix .o, $(basename $(libfdt))) + +$(addprefix $(obj)/,$(libfdt) $(libfdt_hdrs)): $(obj)/%: $(srctree)/scripts/dtc/libfdt/% + $(call cmd,shipped) + +$(addprefix $(obj)/,$(libfdt_objs) atags_to_fdt.o): \ + $(addprefix $(obj)/,$(libfdt_hdrs)) + +ifeq ($(CONFIG_ARM_ATAG_DTB_COMPAT),y) +OBJS += $(libfdt_objs) atags_to_fdt.o +endif + targets := vmlinux vmlinux.lds \ piggy.$(suffix_y) piggy.$(suffix_y).o \ - font.o font.c head.o misc.o $(OBJS) + lib1funcs.o lib1funcs.S font.o font.c head.o misc.o $(OBJS) # Make sure files are removed during clean -extra-y += piggy.gzip piggy.lzo piggy.lzma lib1funcs.S +extra-y += piggy.gzip piggy.lzo piggy.lzma lib1funcs.S $(libfdt) $(libfdt_hdrs) ifeq ($(CONFIG_FUNCTION_TRACER),y) ORIG_CFLAGS := $(KBUILD_CFLAGS) KBUILD_CFLAGS = $(subst -pg, , $(ORIG_CFLAGS)) endif -ccflags-y := -fpic -fno-builtin +ccflags-y := -fpic -fno-builtin -I$(obj) asflags-y := -Wa,-march=all +# Supply kernel BSS size to the decompressor via a linker symbol. +KBSS_SZ = $(shell size $(obj)/../../../../vmlinux | awk 'END{print $$3}') +LDFLAGS_vmlinux = --defsym _kernel_bss_size=$(KBSS_SZ) # Supply ZRELADDR to the decompressor via a linker symbol. 
ifneq ($(CONFIG_AUTO_ZRELADDR),y)
LDFLAGS_vmlinux += --defsym zreladdr=$(ZRELADDR)
@@ -123,7 +147,7 @@ LDFLAGS_vmlinux += -T
# For __aeabi_uidivmod
lib1funcs = $(obj)/lib1funcs.o

-$(obj)/lib1funcs.S: $(srctree)/arch/$(SRCARCH)/lib/lib1funcs.S FORCE
+$(obj)/lib1funcs.S: $(srctree)/arch/$(SRCARCH)/lib/lib1funcs.S
	$(call cmd,shipped)

# We need to prevent any GOTOFF relocs being used with references
diff --git a/arch/arm/boot/compressed/atags_to_fdt.c b/arch/arm/boot/compressed/atags_to_fdt.c
new file mode 100644
index 000000000000..6ce11c481178
--- /dev/null
+++ b/arch/arm/boot/compressed/atags_to_fdt.c
@@ -0,0 +1,97 @@
+#include <asm/setup.h>
+#include <libfdt.h>
+
+static int node_offset(void *fdt, const char *node_path)
+{
+	int offset = fdt_path_offset(fdt, node_path);
+	if (offset == -FDT_ERR_NOTFOUND)
+		offset = fdt_add_subnode(fdt, 0, node_path);
+	return offset;
+}
+
+static int setprop(void *fdt, const char *node_path, const char *property,
+		   uint32_t *val_array, int size)
+{
+	int offset = node_offset(fdt, node_path);
+	if (offset < 0)
+		return offset;
+	return fdt_setprop(fdt, offset, property, val_array, size);
+}
+
+static int setprop_string(void *fdt, const char *node_path,
+			  const char *property, const char *string)
+{
+	int offset = node_offset(fdt, node_path);
+	if (offset < 0)
+		return offset;
+	return fdt_setprop_string(fdt, offset, property, string);
+}
+
+static int setprop_cell(void *fdt, const char *node_path,
+			const char *property, uint32_t val)
+{
+	int offset = node_offset(fdt, node_path);
+	if (offset < 0)
+		return offset;
+	return fdt_setprop_cell(fdt, offset, property, val);
+}
+
+/*
+ * Convert and fold provided ATAGs into the provided FDT.
+ *
+ * Return values:
+ *    = 0 -> pretend success
+ *    = 1 -> bad ATAG (may retry with another possible ATAG pointer)
+ *    < 0 -> error from libfdt
+ */
+int atags_to_fdt(void *atag_list, void *fdt, int total_space)
+{
+	struct tag *atag = atag_list;
+	uint32_t mem_reg_property[2 * NR_BANKS];
+	int memcount = 0;
+	int ret;
+
+	/* make sure we've got an aligned pointer */
+	if ((u32)atag_list & 0x3)
+		return 1;
+
+	/* if we get a DTB here we're done already */
+	if (*(u32 *)atag_list == fdt32_to_cpu(FDT_MAGIC))
+		return 0;
+
+	/* validate the ATAG */
+	if (atag->hdr.tag != ATAG_CORE ||
+	    (atag->hdr.size != tag_size(tag_core) &&
+	     atag->hdr.size != 2))
+		return 1;
+
+	/* let's give it all the room it could need */
+	ret = fdt_open_into(fdt, fdt, total_space);
+	if (ret < 0)
+		return ret;
+
+	for_each_tag(atag, atag_list) {
+		if (atag->hdr.tag == ATAG_CMDLINE) {
+			setprop_string(fdt, "/chosen", "bootargs",
+					atag->u.cmdline.cmdline);
+		} else if (atag->hdr.tag == ATAG_MEM) {
+			if (memcount >= sizeof(mem_reg_property)/4)
+				continue;
+			mem_reg_property[memcount++] = cpu_to_fdt32(atag->u.mem.start);
+			mem_reg_property[memcount++] = cpu_to_fdt32(atag->u.mem.size);
+		} else if (atag->hdr.tag == ATAG_INITRD2) {
+			uint32_t initrd_start, initrd_size;
+			initrd_start = atag->u.initrd.start;
+			initrd_size = atag->u.initrd.size;
+			setprop_cell(fdt, "/chosen", "linux,initrd-start",
+					initrd_start);
+			setprop_cell(fdt, "/chosen", "linux,initrd-end",
+					initrd_start + initrd_size);
+		}
+	}
+
+	if (memcount)
+		setprop(fdt, "/memory", "reg", mem_reg_property, 4*memcount);
+
+	return fdt_pack(fdt);
+}
diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index e95a5989602a..9f5ac11ccd8e 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -216,6 +216,103 @@ restart:	adr	r0, LC0
		mov	r10, r6
#endif
+		mov	r5, #0		@ init
dtb size to 0 +#ifdef CONFIG_ARM_APPENDED_DTB +/* + * r0 = delta + * r2 = BSS start + * r3 = BSS end + * r4 = final kernel address + * r5 = appended dtb size (still unknown) + * r6 = _edata + * r7 = architecture ID + * r8 = atags/device tree pointer + * r9 = size of decompressed image + * r10 = end of this image, including bss/stack/malloc space if non XIP + * r11 = GOT start + * r12 = GOT end + * sp = stack pointer + * + * if there are device trees (dtb) appended to zImage, advance r10 so that the + * dtb data will get relocated along with the kernel if necessary. + */ + + ldr lr, [r6, #0] +#ifndef __ARMEB__ + ldr r1, =0xedfe0dd0 @ sig is 0xd00dfeed big endian +#else + ldr r1, =0xd00dfeed +#endif + cmp lr, r1 + bne dtb_check_done @ not found + +#ifdef CONFIG_ARM_ATAG_DTB_COMPAT + /* + * OK... Let's do some funky business here. + * If we do have a DTB appended to zImage, and we do have + * an ATAG list around, we want the later to be translated + * and folded into the former here. To be on the safe side, + * let's temporarily move the stack away into the malloc + * area. No GOT fixup has occurred yet, but none of the + * code we're about to call uses any global variable. + */ + add sp, sp, #0x10000 + stmfd sp!, {r0-r3, ip, lr} + mov r0, r8 + mov r1, r6 + sub r2, sp, r6 + bl atags_to_fdt + + /* + * If returned value is 1, there is no ATAG at the location + * pointed by r8. Try the typical 0x100 offset from start + * of RAM and hope for the best. + */ + cmp r0, #1 + sub r0, r4, #(TEXT_OFFSET - 0x100) + mov r1, r6 + sub r2, sp, r6 + blne atags_to_fdt + + ldmfd sp!, {r0-r3, ip, lr} + sub sp, sp, #0x10000 +#endif + + mov r8, r6 @ use the appended device tree + + /* + * Make sure that the DTB doesn't end up in the final + * kernel's .bss area. To do so, we adjust the decompressed + * kernel size to compensate if that .bss size is larger + * than the relocated code. + */ + ldr r5, =_kernel_bss_size + adr r1, wont_overwrite + sub r1, r6, r1 + subs r1, r5, r1 + addhi r9, r9, r1 + + /* Get the dtb's size */ + ldr r5, [r6, #4] +#ifndef __ARMEB__ + /* convert r5 (dtb size) to little endian */ + eor r1, r5, r5, ror #16 + bic r1, r1, #0x00ff0000 + mov r5, r5, ror #8 + eor r5, r5, r1, lsr #8 +#endif + + /* preserve 64-bit alignment */ + add r5, r5, #7 + bic r5, r5, #7 + + /* relocate some pointers past the appended dtb */ + add r6, r6, r5 + add r10, r10, r5 + add sp, sp, r5 +dtb_check_done: +#endif + /* * Check to see if we will overwrite ourselves. * r4 = final kernel address @@ -223,15 +320,14 @@ restart: adr r0, LC0 * r10 = end of this image, including bss/stack/malloc space if non XIP * We basically want: * r4 - 16k page directory >= r10 -> OK - * r4 + image length <= current position (pc) -> OK + * r4 + image length <= address of wont_overwrite -> OK */ add r10, r10, #16384 cmp r4, r10 bhs wont_overwrite add r10, r4, r9 - ARM( cmp r10, pc ) - THUMB( mov lr, pc ) - THUMB( cmp r10, lr ) + adr r9, wont_overwrite + cmp r10, r9 bls wont_overwrite /* @@ -285,14 +381,16 @@ wont_overwrite: * r2 = BSS start * r3 = BSS end * r4 = kernel execution address + * r5 = appended dtb size (0 if not present) * r7 = architecture ID * r8 = atags pointer * r11 = GOT start * r12 = GOT end * sp = stack pointer */ - teq r0, #0 + orrs r1, r0, r5 beq not_relocated + add r11, r11, r0 add r12, r12, r0 @@ -307,12 +405,21 @@ wont_overwrite: /* * Relocate all entries in the GOT table. + * Bump bss entries to _edata + dtb size */ 1: ldr r1, [r11, #0] @ relocate entries in the GOT - add r1, r1, r0 @ table. 
This fixes up the - str r1, [r11], #4 @ C references. + add r1, r1, r0 @ This fixes up C references + cmp r1, r2 @ if entry >= bss_start && + cmphs r3, r1 @ bss_end > entry + addhi r1, r1, r5 @ entry += dtb size + str r1, [r11], #4 @ next entry cmp r11, r12 blo 1b + + /* bump our bss pointers too */ + add r2, r2, r5 + add r3, r3, r5 + #else /* diff --git a/arch/arm/boot/compressed/libfdt_env.h b/arch/arm/boot/compressed/libfdt_env.h new file mode 100644 index 000000000000..1f4e71876b00 --- /dev/null +++ b/arch/arm/boot/compressed/libfdt_env.h @@ -0,0 +1,15 @@ +#ifndef _ARM_LIBFDT_ENV_H +#define _ARM_LIBFDT_ENV_H + +#include +#include +#include + +#define fdt16_to_cpu(x) be16_to_cpu(x) +#define cpu_to_fdt16(x) cpu_to_be16(x) +#define fdt32_to_cpu(x) be32_to_cpu(x) +#define cpu_to_fdt32(x) cpu_to_be32(x) +#define fdt64_to_cpu(x) be64_to_cpu(x) +#define cpu_to_fdt64(x) cpu_to_be64(x) + +#endif diff --git a/arch/arm/boot/compressed/misc.c b/arch/arm/boot/compressed/misc.c index 832d37236c59..8e2a8fca5ed2 100644 --- a/arch/arm/boot/compressed/misc.c +++ b/arch/arm/boot/compressed/misc.c @@ -18,14 +18,9 @@ unsigned int __machine_arch_type; -#define _LINUX_STRING_H_ - #include /* for inline */ -#include /* for size_t */ -#include /* for NULL */ +#include #include -#include - static void putstr(const char *ptr); extern void error(char *x); @@ -101,41 +96,6 @@ static void putstr(const char *ptr) flush(); } - -void *memcpy(void *__dest, __const void *__src, size_t __n) -{ - int i = 0; - unsigned char *d = (unsigned char *)__dest, *s = (unsigned char *)__src; - - for (i = __n >> 3; i > 0; i--) { - *d++ = *s++; - *d++ = *s++; - *d++ = *s++; - *d++ = *s++; - *d++ = *s++; - *d++ = *s++; - *d++ = *s++; - *d++ = *s++; - } - - if (__n & 1 << 2) { - *d++ = *s++; - *d++ = *s++; - *d++ = *s++; - *d++ = *s++; - } - - if (__n & 1 << 1) { - *d++ = *s++; - *d++ = *s++; - } - - if (__n & 1) - *d++ = *s++; - - return __dest; -} - /* * gzip declarations */ diff --git a/arch/arm/boot/compressed/mmcif-sh7372.c b/arch/arm/boot/compressed/mmcif-sh7372.c index b6f61d9a5a1b..672ae95db5c3 100644 --- a/arch/arm/boot/compressed/mmcif-sh7372.c +++ b/arch/arm/boot/compressed/mmcif-sh7372.c @@ -82,7 +82,7 @@ asmlinkage void mmc_loader(unsigned char *buf, unsigned long len) /* Disable clock to MMC hardware block */ - __raw_writel(__raw_readl(SMSTPCR3) & (1 << 12), SMSTPCR3); + __raw_writel(__raw_readl(SMSTPCR3) | (1 << 12), SMSTPCR3); mmc_update_progress(MMC_PROGRESS_DONE); } diff --git a/arch/arm/boot/compressed/sdhi-sh7372.c b/arch/arm/boot/compressed/sdhi-sh7372.c index d403a8b24d7f..d279294f2381 100644 --- a/arch/arm/boot/compressed/sdhi-sh7372.c +++ b/arch/arm/boot/compressed/sdhi-sh7372.c @@ -85,7 +85,7 @@ asmlinkage void mmc_loader(unsigned short *buf, unsigned long len) goto err; /* Disable clock to SDHI1 hardware block */ - __raw_writel(__raw_readl(SMSTPCR3) & (1 << 13), SMSTPCR3); + __raw_writel(__raw_readl(SMSTPCR3) | (1 << 13), SMSTPCR3); mmc_update_progress(MMC_PROGRESS_DONE); diff --git a/arch/arm/boot/compressed/string.c b/arch/arm/boot/compressed/string.c new file mode 100644 index 000000000000..36e53ef9200f --- /dev/null +++ b/arch/arm/boot/compressed/string.c @@ -0,0 +1,127 @@ +/* + * arch/arm/boot/compressed/string.c + * + * Small subset of simple string routines + */ + +#include + +void *memcpy(void *__dest, __const void *__src, size_t __n) +{ + int i = 0; + unsigned char *d = (unsigned char *)__dest, *s = (unsigned char *)__src; + + for (i = __n >> 3; i > 0; i--) { + *d++ = *s++; + *d++ = *s++; + *d++ = 
*s++; + *d++ = *s++; + *d++ = *s++; + *d++ = *s++; + *d++ = *s++; + *d++ = *s++; + } + + if (__n & 1 << 2) { + *d++ = *s++; + *d++ = *s++; + *d++ = *s++; + *d++ = *s++; + } + + if (__n & 1 << 1) { + *d++ = *s++; + *d++ = *s++; + } + + if (__n & 1) + *d++ = *s++; + + return __dest; +} + +void *memmove(void *__dest, __const void *__src, size_t count) +{ + unsigned char *d = __dest; + const unsigned char *s = __src; + + if (__dest == __src) + return __dest; + + if (__dest < __src) + return memcpy(__dest, __src, count); + + while (count--) + d[count] = s[count]; + return __dest; +} + +size_t strlen(const char *s) +{ + const char *sc = s; + + while (*sc != '\0') + sc++; + return sc - s; +} + +int memcmp(const void *cs, const void *ct, size_t count) +{ + const unsigned char *su1 = cs, *su2 = ct, *end = su1 + count; + int res = 0; + + while (su1 < end) { + res = *su1++ - *su2++; + if (res) + break; + } + return res; +} + +int strcmp(const char *cs, const char *ct) +{ + unsigned char c1, c2; + int res = 0; + + do { + c1 = *cs++; + c2 = *ct++; + res = c1 - c2; + if (res) + break; + } while (c1); + return res; +} + +void *memchr(const void *s, int c, size_t count) +{ + const unsigned char *p = s; + + while (count--) + if ((unsigned char)c == *p++) + return (void *)(p - 1); + return NULL; +} + +char *strchr(const char *s, int c) +{ + while (*s != (char)c) + if (*s++ == '\0') + return NULL; + return (char *)s; +} + +#undef memset + +void *memset(void *s, int c, size_t count) +{ + char *xs = s; + while (count--) + *xs++ = c; + return s; +} + +void __memzero(void *s, size_t count) +{ + memset(s, 0, count); +} diff --git a/arch/arm/boot/compressed/vmlinux.lds.in b/arch/arm/boot/compressed/vmlinux.lds.in index 4e728834a1b9..4919f2ac8b89 100644 --- a/arch/arm/boot/compressed/vmlinux.lds.in +++ b/arch/arm/boot/compressed/vmlinux.lds.in @@ -51,6 +51,10 @@ SECTIONS _got_start = .; .got : { *(.got) } _got_end = .; + + /* ensure the zImage file size is always a multiple of 64 bits */ + /* (without a dummy byte, ld just ignores the empty section) */ + .pad : { BYTE(0); . = ALIGN(8); } _edata = .; . = BSS_START; diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h index 7a21d0bf7134..7f27fab9d404 100644 --- a/arch/arm/include/asm/dma-mapping.h +++ b/arch/arm/include/asm/dma-mapping.h @@ -205,6 +205,13 @@ extern void *dma_alloc_writecombine(struct device *, size_t, dma_addr_t *, int dma_mmap_writecombine(struct device *, struct vm_area_struct *, void *, dma_addr_t, size_t); +/* + * This can be called during boot to increase the size of the consistent + * DMA region above it's default value of 2MB. It must be called before the + * memory allocator is initialised, i.e. before any core_initcall. + */ +extern void __init init_consistent_dma_size(unsigned long size); + #ifdef CONFIG_DMABOUNCE /* diff --git a/arch/arm/include/asm/hardware/cache-l2x0.h b/arch/arm/include/asm/hardware/cache-l2x0.h index bfa706ffd968..99a6ed7e1bfd 100644 --- a/arch/arm/include/asm/hardware/cache-l2x0.h +++ b/arch/arm/include/asm/hardware/cache-l2x0.h @@ -45,8 +45,13 @@ #define L2X0_CLEAN_INV_LINE_PA 0x7F0 #define L2X0_CLEAN_INV_LINE_IDX 0x7F8 #define L2X0_CLEAN_INV_WAY 0x7FC -#define L2X0_LOCKDOWN_WAY_D 0x900 -#define L2X0_LOCKDOWN_WAY_I 0x904 +/* + * The lockdown registers repeat 8 times for L310, the L210 has only one + * D and one I lockdown register at 0x0900 and 0x0904. 
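+ * On the L310, the D/I lockdown pair for CPU n therefore lives at
+ * L2X0_LOCKDOWN_WAY_D_BASE/L2X0_LOCKDOWN_WAY_I_BASE plus
+ * n * L2X0_LOCKDOWN_STRIDE.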
+ */ +#define L2X0_LOCKDOWN_WAY_D_BASE 0x900 +#define L2X0_LOCKDOWN_WAY_I_BASE 0x904 +#define L2X0_LOCKDOWN_STRIDE 0x08 #define L2X0_TEST_OPERATION 0xF00 #define L2X0_LINE_DATA 0xF10 #define L2X0_LINE_TAG 0xF30 diff --git a/arch/arm/include/asm/hw_breakpoint.h b/arch/arm/include/asm/hw_breakpoint.h index f389b2704d82..c190bc992f0e 100644 --- a/arch/arm/include/asm/hw_breakpoint.h +++ b/arch/arm/include/asm/hw_breakpoint.h @@ -50,6 +50,7 @@ static inline void decode_ctrl_reg(u32 reg, #define ARM_DEBUG_ARCH_V6_1 2 #define ARM_DEBUG_ARCH_V7_ECP14 3 #define ARM_DEBUG_ARCH_V7_MM 4 +#define ARM_DEBUG_ARCH_V7_1 5 /* Breakpoint */ #define ARM_BREAKPOINT_EXECUTE 0 @@ -57,6 +58,7 @@ static inline void decode_ctrl_reg(u32 reg, /* Watchpoints */ #define ARM_BREAKPOINT_LOAD 1 #define ARM_BREAKPOINT_STORE 2 +#define ARM_FSR_ACCESS_MASK (1 << 11) /* Privilege Levels */ #define ARM_BREAKPOINT_PRIV 1 diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h index 217aa1911dd7..727da118bcc1 100644 --- a/arch/arm/include/asm/mach/arch.h +++ b/arch/arm/include/asm/mach/arch.h @@ -17,7 +17,7 @@ struct sys_timer; struct machine_desc { unsigned int nr; /* architecture number */ const char *name; /* architecture name */ - unsigned long boot_params; /* tagged list */ + unsigned long atag_offset; /* tagged list (relative) */ const char **dt_compat; /* array of device tree * 'compatible' strings */ diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index b8de516e600e..652fccca4952 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -77,16 +77,7 @@ */ #define IOREMAP_MAX_ORDER 24 -/* - * Size of DMA-consistent memory region. Must be multiple of 2M, - * between 2MB and 14MB inclusive. - */ -#ifndef CONSISTENT_DMA_SIZE -#define CONSISTENT_DMA_SIZE SZ_2M -#endif - #define CONSISTENT_END (0xffe00000UL) -#define CONSISTENT_BASE (CONSISTENT_END - CONSISTENT_DMA_SIZE) #else /* CONFIG_MMU */ diff --git a/arch/arm/include/asm/pmu.h b/arch/arm/include/asm/pmu.h index b7e82c4aced6..71d99b83cdb9 100644 --- a/arch/arm/include/asm/pmu.h +++ b/arch/arm/include/asm/pmu.h @@ -13,7 +13,12 @@ #define __ARM_PMU_H__ #include +#include +/* + * Types of PMUs that can be accessed directly and require mutual + * exclusion between profiling tools. + */ enum arm_pmu_type { ARM_PMU_DEVICE_CPU = 0, ARM_NUM_PMU_DEVICES, @@ -37,21 +42,17 @@ struct arm_pmu_platdata { * reserve_pmu() - reserve the hardware performance counters * * Reserve the hardware performance counters in the system for exclusive use. - * The platform_device for the system is returned on success, ERR_PTR() - * encoded error on failure. + * Returns 0 on success or -EBUSY if the lock is already held. */ -extern struct platform_device * +extern int reserve_pmu(enum arm_pmu_type type); /** * release_pmu() - Relinquish control of the performance counters * * Release the performance counters and allow someone else to use them. - * Callers must have disabled the counters and released IRQs before calling - * this. The platform_device returned from reserve_pmu() must be passed as - * a cookie. 
*/ -extern int +extern void release_pmu(enum arm_pmu_type type); /** @@ -68,24 +69,78 @@ init_pmu(enum arm_pmu_type type); #include -static inline struct platform_device * -reserve_pmu(enum arm_pmu_type type) -{ - return ERR_PTR(-ENODEV); -} - static inline int -release_pmu(enum arm_pmu_type type) +reserve_pmu(enum arm_pmu_type type) { return -ENODEV; } -static inline int -init_pmu(enum arm_pmu_type type) -{ - return -ENODEV; -} +static inline void +release_pmu(enum arm_pmu_type type) { } #endif /* CONFIG_CPU_HAS_PMU */ +#ifdef CONFIG_HW_PERF_EVENTS + +/* The events for a given PMU register set. */ +struct pmu_hw_events { + /* + * The events that are active on the PMU for the given index. + */ + struct perf_event **events; + + /* + * A 1 bit for an index indicates that the counter is being used for + * an event. A 0 means that the counter can be used. + */ + unsigned long *used_mask; + + /* + * Hardware lock to serialize accesses to PMU registers. Needed for the + * read/modify/write sequences. + */ + raw_spinlock_t pmu_lock; +}; + +struct arm_pmu { + struct pmu pmu; + enum arm_perf_pmu_ids id; + enum arm_pmu_type type; + cpumask_t active_irqs; + const char *name; + irqreturn_t (*handle_irq)(int irq_num, void *dev); + void (*enable)(struct hw_perf_event *evt, int idx); + void (*disable)(struct hw_perf_event *evt, int idx); + int (*get_event_idx)(struct pmu_hw_events *hw_events, + struct hw_perf_event *hwc); + int (*set_event_filter)(struct hw_perf_event *evt, + struct perf_event_attr *attr); + u32 (*read_counter)(int idx); + void (*write_counter)(int idx, u32 val); + void (*start)(void); + void (*stop)(void); + void (*reset)(void *); + int (*map_event)(struct perf_event *event); + int num_events; + atomic_t active_events; + struct mutex reserve_mutex; + u64 max_period; + struct platform_device *plat_device; + struct pmu_hw_events *(*get_hw_events)(void); +}; + +#define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu)) + +int __init armpmu_register(struct arm_pmu *armpmu, char *name, int type); + +u64 armpmu_event_update(struct perf_event *event, + struct hw_perf_event *hwc, + int idx, int overflow); + +int armpmu_event_set_period(struct perf_event *event, + struct hw_perf_event *hwc, + int idx); + +#endif /* CONFIG_HW_PERF_EVENTS */ + #endif /* __ARM_PMU_H__ */ diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile index 787b88823a56..8fa83f54c967 100644 --- a/arch/arm/kernel/Makefile +++ b/arch/arm/kernel/Makefile @@ -43,6 +43,13 @@ obj-$(CONFIG_KPROBES) += kprobes-thumb.o else obj-$(CONFIG_KPROBES) += kprobes-arm.o endif +obj-$(CONFIG_ARM_KPROBES_TEST) += test-kprobes.o +test-kprobes-objs := kprobes-test.o +ifdef CONFIG_THUMB2_KERNEL +test-kprobes-objs += kprobes-test-thumb.o +else +test-kprobes-objs += kprobes-test-arm.o +endif obj-$(CONFIG_ATAGS_PROC) += atags.o obj-$(CONFIG_OABI_COMPAT) += sys_oabi-compat.o obj-$(CONFIG_ARM_THUMBEE) += thumbee.o diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S index 80f7896cc016..9943e9e74a1b 100644 --- a/arch/arm/kernel/calls.S +++ b/arch/arm/kernel/calls.S @@ -178,7 +178,7 @@ CALL(sys_ni_syscall) /* vm86 */ CALL(sys_ni_syscall) /* was sys_query_module */ CALL(sys_poll) - CALL(sys_nfsservctl) + CALL(sys_ni_syscall) /* was nfsservctl */ /* 170 */ CALL(sys_setresgid16) CALL(sys_getresgid16) CALL(sys_prctl) diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c index a927ca1f5566..5a46225f007e 100644 --- a/arch/arm/kernel/hw_breakpoint.c +++ b/arch/arm/kernel/hw_breakpoint.c @@ -45,7 +45,6 @@ static 
DEFINE_PER_CPU(struct perf_event *, wp_on_reg[ARM_MAX_WRP]); /* Number of BRP/WRP registers on this CPU. */ static int core_num_brps; -static int core_num_reserved_brps; static int core_num_wrps; /* Debug architecture version. */ @@ -137,10 +136,11 @@ static u8 get_debug_arch(void) u32 didr; /* Do we implement the extended CPUID interface? */ - if (WARN_ONCE((((read_cpuid_id() >> 16) & 0xf) != 0xf), - "CPUID feature registers not supported. " - "Assuming v6 debug is present.\n")) + if (((read_cpuid_id() >> 16) & 0xf) != 0xf) { + pr_warning("CPUID feature registers not supported. " + "Assuming v6 debug is present.\n"); return ARM_DEBUG_ARCH_V6; + } ARM_DBG_READ(c0, 0, didr); return (didr >> 16) & 0xf; @@ -154,10 +154,21 @@ u8 arch_get_debug_arch(void) static int debug_arch_supported(void) { u8 arch = get_debug_arch(); - return arch >= ARM_DEBUG_ARCH_V6 && arch <= ARM_DEBUG_ARCH_V7_ECP14; + + /* We don't support the memory-mapped interface. */ + return (arch >= ARM_DEBUG_ARCH_V6 && arch <= ARM_DEBUG_ARCH_V7_ECP14) || + arch >= ARM_DEBUG_ARCH_V7_1; +} + +/* Determine number of WRP registers available. */ +static int get_num_wrp_resources(void) +{ + u32 didr; + ARM_DBG_READ(c0, 0, didr); + return ((didr >> 28) & 0xf) + 1; } -/* Determine number of BRP register available. */ +/* Determine number of BRP registers available. */ static int get_num_brp_resources(void) { u32 didr; @@ -176,9 +187,10 @@ static int core_has_mismatch_brps(void) static int get_num_wrps(void) { /* - * FIXME: When a watchpoint fires, the only way to work out which - * watchpoint it was is by disassembling the faulting instruction - * and working out the address of the memory access. + * On debug architectures prior to 7.1, when a watchpoint fires, the + * only way to work out which watchpoint it was is by disassembling + * the faulting instruction and working out the address of the memory + * access. * * Furthermore, we can only do this if the watchpoint was precise * since imprecise watchpoints prevent us from calculating register @@ -192,36 +204,17 @@ static int get_num_wrps(void) * [the ARM ARM states that the DFAR is UNKNOWN, but experience shows * that it is set on some implementations]. */ + if (get_debug_arch() < ARM_DEBUG_ARCH_V7_1) + return 1; -#if 0 - int wrps; - u32 didr; - ARM_DBG_READ(c0, 0, didr); - wrps = ((didr >> 28) & 0xf) + 1; -#endif - int wrps = 1; - - if (core_has_mismatch_brps() && wrps >= get_num_brp_resources()) - wrps = get_num_brp_resources() - 1; - - return wrps; -} - -/* We reserve one breakpoint for each watchpoint. */ -static int get_num_reserved_brps(void) -{ - if (core_has_mismatch_brps()) - return get_num_wrps(); - return 0; + return get_num_wrp_resources(); } /* Determine number of usable BRPs available. */ static int get_num_brps(void) { int brps = get_num_brp_resources(); - if (core_has_mismatch_brps()) - brps -= get_num_reserved_brps(); - return brps; + return core_has_mismatch_brps() ? brps - 1 : brps; } /* @@ -239,7 +232,7 @@ static int enable_monitor_mode(void) /* Ensure that halting mode is disabled. */ if (WARN_ONCE(dscr & ARM_DSCR_HDBGEN, - "halting debug mode enabled. Unable to access hardware resources.\n")) { + "halting debug mode enabled. 
Unable to access hardware resources.\n")) { ret = -EPERM; goto out; } @@ -255,6 +248,7 @@ static int enable_monitor_mode(void) ARM_DBG_WRITE(c1, 0, (dscr | ARM_DSCR_MDBGEN)); break; case ARM_DEBUG_ARCH_V7_ECP14: + case ARM_DEBUG_ARCH_V7_1: ARM_DBG_WRITE(c2, 2, (dscr | ARM_DSCR_MDBGEN)); break; default: @@ -346,24 +340,10 @@ int arch_install_hw_breakpoint(struct perf_event *bp) val_base = ARM_BASE_BVR; slots = (struct perf_event **)__get_cpu_var(bp_on_reg); max_slots = core_num_brps; - if (info->step_ctrl.enabled) { - /* Override the breakpoint data with the step data. */ - addr = info->trigger & ~0x3; - ctrl = encode_ctrl_reg(info->step_ctrl); - } } else { /* Watchpoint */ - if (info->step_ctrl.enabled) { - /* Install into the reserved breakpoint region. */ - ctrl_base = ARM_BASE_BCR + core_num_brps; - val_base = ARM_BASE_BVR + core_num_brps; - /* Override the watchpoint data with the step data. */ - addr = info->trigger & ~0x3; - ctrl = encode_ctrl_reg(info->step_ctrl); - } else { - ctrl_base = ARM_BASE_WCR; - val_base = ARM_BASE_WVR; - } + ctrl_base = ARM_BASE_WCR; + val_base = ARM_BASE_WVR; slots = (struct perf_event **)__get_cpu_var(wp_on_reg); max_slots = core_num_wrps; } @@ -382,6 +362,17 @@ int arch_install_hw_breakpoint(struct perf_event *bp) goto out; } + /* Override the breakpoint data with the step data. */ + if (info->step_ctrl.enabled) { + addr = info->trigger & ~0x3; + ctrl = encode_ctrl_reg(info->step_ctrl); + if (info->ctrl.type != ARM_BREAKPOINT_EXECUTE) { + i = 0; + ctrl_base = ARM_BASE_BCR + core_num_brps; + val_base = ARM_BASE_BVR + core_num_brps; + } + } + /* Setup the address register. */ write_wb_reg(val_base + i, addr); @@ -405,10 +396,7 @@ void arch_uninstall_hw_breakpoint(struct perf_event *bp) max_slots = core_num_brps; } else { /* Watchpoint */ - if (info->step_ctrl.enabled) - base = ARM_BASE_BCR + core_num_brps; - else - base = ARM_BASE_WCR; + base = ARM_BASE_WCR; slots = (struct perf_event **)__get_cpu_var(wp_on_reg); max_slots = core_num_wrps; } @@ -426,6 +414,13 @@ void arch_uninstall_hw_breakpoint(struct perf_event *bp) if (WARN_ONCE(i == max_slots, "Can't find any breakpoint slot\n")) return; + /* Ensure that we disable the mismatch breakpoint. */ + if (info->ctrl.type != ARM_BREAKPOINT_EXECUTE && + info->step_ctrl.enabled) { + i = 0; + base = ARM_BASE_BCR + core_num_brps; + } + /* Reset the control register. */ write_wb_reg(base + i, 0); } @@ -632,10 +627,9 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp) * we can use the mismatch feature as a poor-man's hardware * single-step, but this only works for per-task breakpoints. 
*/ - if (WARN_ONCE(!bp->overflow_handler && - (arch_check_bp_in_kernelspace(bp) || !core_has_mismatch_brps() - || !bp->hw.bp_target), - "overflow handler required but none found\n")) { + if (!bp->overflow_handler && (arch_check_bp_in_kernelspace(bp) || + !core_has_mismatch_brps() || !bp->hw.bp_target)) { + pr_warning("overflow handler required but none found\n"); ret = -EINVAL; } out: @@ -666,34 +660,62 @@ static void disable_single_step(struct perf_event *bp) arch_install_hw_breakpoint(bp); } -static void watchpoint_handler(unsigned long unknown, struct pt_regs *regs) +static void watchpoint_handler(unsigned long addr, unsigned int fsr, + struct pt_regs *regs) { - int i; + int i, access; + u32 val, ctrl_reg, alignment_mask; struct perf_event *wp, **slots; struct arch_hw_breakpoint *info; + struct arch_hw_breakpoint_ctrl ctrl; slots = (struct perf_event **)__get_cpu_var(wp_on_reg); - /* Without a disassembler, we can only handle 1 watchpoint. */ - BUG_ON(core_num_wrps > 1); - for (i = 0; i < core_num_wrps; ++i) { rcu_read_lock(); wp = slots[i]; - if (wp == NULL) { - rcu_read_unlock(); - continue; - } + if (wp == NULL) + goto unlock; + info = counter_arch_bp(wp); /* - * The DFAR is an unknown value. Since we only allow a - * single watchpoint, we can set the trigger to the lowest - * possible faulting address. + * The DFAR is an unknown value on debug architectures prior + * to 7.1. Since we only allow a single watchpoint on these + * older CPUs, we can set the trigger to the lowest possible + * faulting address. */ - info = counter_arch_bp(wp); - info->trigger = wp->attr.bp_addr; + if (debug_arch < ARM_DEBUG_ARCH_V7_1) { + BUG_ON(i > 0); + info->trigger = wp->attr.bp_addr; + } else { + if (info->ctrl.len == ARM_BREAKPOINT_LEN_8) + alignment_mask = 0x7; + else + alignment_mask = 0x3; + + /* Check if the watchpoint value matches. */ + val = read_wb_reg(ARM_BASE_WVR + i); + if (val != (addr & ~alignment_mask)) + goto unlock; + + /* Possible match, check the byte address select. */ + ctrl_reg = read_wb_reg(ARM_BASE_WCR + i); + decode_ctrl_reg(ctrl_reg, &ctrl); + if (!((1 << (addr & alignment_mask)) & ctrl.len)) + goto unlock; + + /* Check that the access type matches. */ + access = (fsr & ARM_FSR_ACCESS_MASK) ? HW_BREAKPOINT_W : + HW_BREAKPOINT_R; + if (!(access & hw_breakpoint_type(wp))) + goto unlock; + + /* We have a winner. */ + info->trigger = addr; + } + pr_debug("watchpoint fired: address = 0x%x\n", info->trigger); perf_bp_event(wp, regs); @@ -705,6 +727,7 @@ static void watchpoint_handler(unsigned long unknown, struct pt_regs *regs) if (!wp->overflow_handler) enable_single_step(wp, instruction_pointer(regs)); +unlock: rcu_read_unlock(); } } @@ -717,7 +740,7 @@ static void watchpoint_single_step_handler(unsigned long pc) slots = (struct perf_event **)__get_cpu_var(wp_on_reg); - for (i = 0; i < core_num_reserved_brps; ++i) { + for (i = 0; i < core_num_wrps; ++i) { rcu_read_lock(); wp = slots[i]; @@ -820,7 +843,7 @@ static int hw_breakpoint_pending(unsigned long addr, unsigned int fsr, case ARM_ENTRY_ASYNC_WATCHPOINT: WARN(1, "Asynchronous watchpoint exception taken. Debugging results may be unreliable\n"); case ARM_ENTRY_SYNC_WATCHPOINT: - watchpoint_handler(addr, regs); + watchpoint_handler(addr, fsr, regs); break; default: ret = 1; /* Unhandled fault. */ @@ -834,11 +857,31 @@ static int hw_breakpoint_pending(unsigned long addr, unsigned int fsr, /* * One-time initialisation. 
*/ -static void reset_ctrl_regs(void *info) +static cpumask_t debug_err_mask; + +static int debug_reg_trap(struct pt_regs *regs, unsigned int instr) { - int i, cpu = smp_processor_id(); + int cpu = smp_processor_id(); + + pr_warning("Debug register access (0x%x) caused undefined instruction on CPU %d\n", + instr, cpu); + + /* Set the error flag for this CPU and skip the faulting instruction. */ + cpumask_set_cpu(cpu, &debug_err_mask); + instruction_pointer(regs) += 4; + return 0; +} + +static struct undef_hook debug_reg_hook = { + .instr_mask = 0x0fe80f10, + .instr_val = 0x0e000e10, + .fn = debug_reg_trap, +}; + +static void reset_ctrl_regs(void *unused) +{ + int i, raw_num_brps, err = 0, cpu = smp_processor_id(); u32 dbg_power; - cpumask_t *cpumask = info; /* * v7 debug contains save and restore registers so that debug state @@ -848,38 +891,52 @@ static void reset_ctrl_regs(void *info) * Access Register to avoid taking undefined instruction exceptions * later on. */ - if (debug_arch >= ARM_DEBUG_ARCH_V7_ECP14) { + switch (debug_arch) { + case ARM_DEBUG_ARCH_V7_ECP14: /* * Ensure sticky power-down is clear (i.e. debug logic is * powered up). */ asm volatile("mrc p14, 0, %0, c1, c5, 4" : "=r" (dbg_power)); - if ((dbg_power & 0x1) == 0) { - pr_warning("CPU %d debug is powered down!\n", cpu); - cpumask_or(cpumask, cpumask, cpumask_of(cpu)); - return; - } - + if ((dbg_power & 0x1) == 0) + err = -EPERM; + break; + case ARM_DEBUG_ARCH_V7_1: /* - * Unconditionally clear the lock by writing a value - * other than 0xC5ACCE55 to the access register. + * Ensure the OS double lock is clear. */ - asm volatile("mcr p14, 0, %0, c1, c0, 4" : : "r" (0)); - isb(); + asm volatile("mrc p14, 0, %0, c1, c3, 4" : "=r" (dbg_power)); + if ((dbg_power & 0x1) == 1) + err = -EPERM; + break; + } - /* - * Clear any configured vector-catch events before - * enabling monitor mode. - */ - asm volatile("mcr p14, 0, %0, c0, c7, 0" : : "r" (0)); - isb(); + if (err) { + pr_warning("CPU %d debug is powered down!\n", cpu); + cpumask_or(&debug_err_mask, &debug_err_mask, cpumask_of(cpu)); + return; } + /* + * Unconditionally clear the lock by writing a value + * other than 0xC5ACCE55 to the access register. + */ + asm volatile("mcr p14, 0, %0, c1, c0, 4" : : "r" (0)); + isb(); + + /* + * Clear any configured vector-catch events before + * enabling monitor mode. + */ + asm volatile("mcr p14, 0, %0, c0, c7, 0" : : "r" (0)); + isb(); + if (enable_monitor_mode()) return; /* We must also reset any reserved registers. */ - for (i = 0; i < core_num_brps + core_num_reserved_brps; ++i) { + raw_num_brps = get_num_brp_resources(); + for (i = 0; i < raw_num_brps; ++i) { write_wb_reg(ARM_BASE_BCR + i, 0UL); write_wb_reg(ARM_BASE_BVR + i, 0UL); } @@ -895,6 +952,7 @@ static int __cpuinit dbg_reset_notify(struct notifier_block *self, { if (action == CPU_ONLINE) smp_call_function_single((int)cpu, reset_ctrl_regs, NULL, 1); + return NOTIFY_OK; } @@ -905,7 +963,6 @@ static struct notifier_block __cpuinitdata dbg_reset_nb = { static int __init arch_hw_breakpoint_init(void) { u32 dscr; - cpumask_t cpumask = { CPU_BITS_NONE }; debug_arch = get_debug_arch(); @@ -916,28 +973,31 @@ static int __init arch_hw_breakpoint_init(void) /* Determine how many BRPs/WRPs are available. 
*/ core_num_brps = get_num_brps(); - core_num_reserved_brps = get_num_reserved_brps(); core_num_wrps = get_num_wrps(); - pr_info("found %d breakpoint and %d watchpoint registers.\n", - core_num_brps + core_num_reserved_brps, core_num_wrps); - - if (core_num_reserved_brps) - pr_info("%d breakpoint(s) reserved for watchpoint " - "single-step.\n", core_num_reserved_brps); + /* + * We need to tread carefully here because DBGSWENABLE may be + * driven low on this core and there isn't an architected way to + * determine that. + */ + register_undef_hook(&debug_reg_hook); /* * Reset the breakpoint resources. We assume that a halting * debugger will leave the world in a nice state for us. */ - on_each_cpu(reset_ctrl_regs, &cpumask, 1); - if (!cpumask_empty(&cpumask)) { + on_each_cpu(reset_ctrl_regs, NULL, 1); + unregister_undef_hook(&debug_reg_hook); + if (!cpumask_empty(&debug_err_mask)) { core_num_brps = 0; - core_num_reserved_brps = 0; core_num_wrps = 0; return 0; } + pr_info("found %d " "%s" "breakpoint and %d watchpoint registers.\n", + core_num_brps, core_has_mismatch_brps() ? "(+1 reserved) " : + "", core_num_wrps); + ARM_DBG_READ(c1, 0, dscr); if (dscr & ARM_DSCR_HDBGEN) { max_watchpoint_len = 4; diff --git a/arch/arm/kernel/kprobes-arm.c b/arch/arm/kernel/kprobes-arm.c index 79203ee1d039..9fe8910308af 100644 --- a/arch/arm/kernel/kprobes-arm.c +++ b/arch/arm/kernel/kprobes-arm.c @@ -60,6 +60,7 @@ #include #include +#include #include "kprobes.h" @@ -971,6 +972,9 @@ const union decode_item kprobe_decode_arm_table[] = { DECODE_END }; +#ifdef CONFIG_ARM_KPROBES_TEST_MODULE +EXPORT_SYMBOL_GPL(kprobe_decode_arm_table); +#endif static void __kprobes arm_singlestep(struct kprobe *p, struct pt_regs *regs) { diff --git a/arch/arm/kernel/kprobes-test-arm.c b/arch/arm/kernel/kprobes-test-arm.c new file mode 100644 index 000000000000..fc82de8bdcce --- /dev/null +++ b/arch/arm/kernel/kprobes-test-arm.c @@ -0,0 +1,1323 @@ +/* + * arch/arm/kernel/kprobes-test-arm.c + * + * Copyright (C) 2011 Jon Medhurst . + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. 
+ */ + +#include +#include + +#include "kprobes-test.h" + + +#define TEST_ISA "32" + +#define TEST_ARM_TO_THUMB_INTERWORK_R(code1, reg, val, code2) \ + TESTCASE_START(code1 #reg code2) \ + TEST_ARG_REG(reg, val) \ + TEST_ARG_REG(14, 99f) \ + TEST_ARG_END("") \ + "50: nop \n\t" \ + "1: "code1 #reg code2" \n\t" \ + " bx lr \n\t" \ + ".thumb \n\t" \ + "3: adr lr, 2f \n\t" \ + " bx lr \n\t" \ + ".arm \n\t" \ + "2: nop \n\t" \ + TESTCASE_END + +#define TEST_ARM_TO_THUMB_INTERWORK_P(code1, reg, val, code2) \ + TESTCASE_START(code1 #reg code2) \ + TEST_ARG_PTR(reg, val) \ + TEST_ARG_REG(14, 99f) \ + TEST_ARG_MEM(15, 3f+1) \ + TEST_ARG_END("") \ + "50: nop \n\t" \ + "1: "code1 #reg code2" \n\t" \ + " bx lr \n\t" \ + ".thumb \n\t" \ + "3: adr lr, 2f \n\t" \ + " bx lr \n\t" \ + ".arm \n\t" \ + "2: nop \n\t" \ + TESTCASE_END + + +void kprobe_arm_test_cases(void) +{ + kprobe_test_flags = 0; + + TEST_GROUP("Data-processing (register), (register-shifted register), (immediate)") + +#define _DATA_PROCESSING_DNM(op,s,val) \ + TEST_RR( op "eq" s " r0, r",1, VAL1,", r",2, val, "") \ + TEST_RR( op "ne" s " r1, r",1, VAL1,", r",2, val, ", lsl #3") \ + TEST_RR( op "cs" s " r2, r",3, VAL1,", r",2, val, ", lsr #4") \ + TEST_RR( op "cc" s " r3, r",3, VAL1,", r",2, val, ", asr #5") \ + TEST_RR( op "mi" s " r4, r",5, VAL1,", r",2, N(val),", asr #6") \ + TEST_RR( op "pl" s " r5, r",5, VAL1,", r",2, val, ", ror #7") \ + TEST_RR( op "vs" s " r6, r",7, VAL1,", r",2, val, ", rrx") \ + TEST_R( op "vc" s " r6, r",7, VAL1,", pc, lsl #3") \ + TEST_R( op "vc" s " r6, r",7, VAL1,", sp, lsr #4") \ + TEST_R( op "vc" s " r6, pc, r",7, VAL1,", asr #5") \ + TEST_R( op "vc" s " r6, sp, r",7, VAL1,", ror #6") \ + TEST_RRR( op "hi" s " r8, r",9, VAL1,", r",14,val, ", lsl r",0, 3,"")\ + TEST_RRR( op "ls" s " r9, r",9, VAL1,", r",14,val, ", lsr r",7, 4,"")\ + TEST_RRR( op "ge" s " r10, r",11,VAL1,", r",14,val, ", asr r",7, 5,"")\ + TEST_RRR( op "lt" s " r11, r",11,VAL1,", r",14,N(val),", asr r",7, 6,"")\ + TEST_RR( op "gt" s " r12, r13" ", r",14,val, ", ror r",14,7,"")\ + TEST_RR( op "le" s " r14, r",0, val, ", r13" ", lsl r",14,8,"")\ + TEST_RR( op s " r12, pc" ", r",14,val, ", ror r",14,7,"")\ + TEST_RR( op s " r14, r",0, val, ", pc" ", lsl r",14,8,"")\ + TEST_R( op "eq" s " r0, r",11,VAL1,", #0xf5") \ + TEST_R( op "ne" s " r11, r",0, VAL1,", #0xf5000000") \ + TEST_R( op s " r7, r",8, VAL2,", #0x000af000") \ + TEST( op s " r4, pc" ", #0x00005a00") + +#define DATA_PROCESSING_DNM(op,val) \ + _DATA_PROCESSING_DNM(op,"",val) \ + _DATA_PROCESSING_DNM(op,"s",val) + +#define DATA_PROCESSING_NM(op,val) \ + TEST_RR( op "ne r",1, VAL1,", r",2, val, "") \ + TEST_RR( op "eq r",1, VAL1,", r",2, val, ", lsl #3") \ + TEST_RR( op "cc r",3, VAL1,", r",2, val, ", lsr #4") \ + TEST_RR( op "cs r",3, VAL1,", r",2, val, ", asr #5") \ + TEST_RR( op "pl r",5, VAL1,", r",2, N(val),", asr #6") \ + TEST_RR( op "mi r",5, VAL1,", r",2, val, ", ror #7") \ + TEST_RR( op "vc r",7, VAL1,", r",2, val, ", rrx") \ + TEST_R ( op "vs r",7, VAL1,", pc, lsl #3") \ + TEST_R ( op "vs r",7, VAL1,", sp, lsr #4") \ + TEST_R( op "vs pc, r",7, VAL1,", asr #5") \ + TEST_R( op "vs sp, r",7, VAL1,", ror #6") \ + TEST_RRR( op "ls r",9, VAL1,", r",14,val, ", lsl r",0, 3,"") \ + TEST_RRR( op "hi r",9, VAL1,", r",14,val, ", lsr r",7, 4,"") \ + TEST_RRR( op "lt r",11,VAL1,", r",14,val, ", asr r",7, 5,"") \ + TEST_RRR( op "ge r",11,VAL1,", r",14,N(val),", asr r",7, 6,"") \ + TEST_RR( op "le r13" ", r",14,val, ", ror r",14,7,"") \ + TEST_RR( op "gt r",0, val, ", r13" ", lsl r",14,8,"") \ + 
TEST_RR( op " pc" ", r",14,val, ", ror r",14,7,"") \ + TEST_RR( op " r",0, val, ", pc" ", lsl r",14,8,"") \ + TEST_R( op "eq r",11,VAL1,", #0xf5") \ + TEST_R( op "ne r",0, VAL1,", #0xf5000000") \ + TEST_R( op " r",8, VAL2,", #0x000af000") + +#define _DATA_PROCESSING_DM(op,s,val) \ + TEST_R( op "eq" s " r0, r",1, val, "") \ + TEST_R( op "ne" s " r1, r",1, val, ", lsl #3") \ + TEST_R( op "cs" s " r2, r",3, val, ", lsr #4") \ + TEST_R( op "cc" s " r3, r",3, val, ", asr #5") \ + TEST_R( op "mi" s " r4, r",5, N(val),", asr #6") \ + TEST_R( op "pl" s " r5, r",5, val, ", ror #7") \ + TEST_R( op "vs" s " r6, r",10,val, ", rrx") \ + TEST( op "vs" s " r7, pc, lsl #3") \ + TEST( op "vs" s " r7, sp, lsr #4") \ + TEST_RR( op "vc" s " r8, r",7, val, ", lsl r",0, 3,"") \ + TEST_RR( op "hi" s " r9, r",9, val, ", lsr r",7, 4,"") \ + TEST_RR( op "ls" s " r10, r",9, val, ", asr r",7, 5,"") \ + TEST_RR( op "ge" s " r11, r",11,N(val),", asr r",7, 6,"") \ + TEST_RR( op "lt" s " r12, r",11,val, ", ror r",14,7,"") \ + TEST_R( op "gt" s " r14, r13" ", lsl r",14,8,"") \ + TEST_R( op "le" s " r14, pc" ", lsl r",14,8,"") \ + TEST( op "eq" s " r0, #0xf5") \ + TEST( op "ne" s " r11, #0xf5000000") \ + TEST( op s " r7, #0x000af000") \ + TEST( op s " r4, #0x00005a00") + +#define DATA_PROCESSING_DM(op,val) \ + _DATA_PROCESSING_DM(op,"",val) \ + _DATA_PROCESSING_DM(op,"s",val) + + DATA_PROCESSING_DNM("and",0xf00f00ff) + DATA_PROCESSING_DNM("eor",0xf00f00ff) + DATA_PROCESSING_DNM("sub",VAL2) + DATA_PROCESSING_DNM("rsb",VAL2) + DATA_PROCESSING_DNM("add",VAL2) + DATA_PROCESSING_DNM("adc",VAL2) + DATA_PROCESSING_DNM("sbc",VAL2) + DATA_PROCESSING_DNM("rsc",VAL2) + DATA_PROCESSING_NM("tst",0xf00f00ff) + DATA_PROCESSING_NM("teq",0xf00f00ff) + DATA_PROCESSING_NM("cmp",VAL2) + DATA_PROCESSING_NM("cmn",VAL2) + DATA_PROCESSING_DNM("orr",0xf00f00ff) + DATA_PROCESSING_DM("mov",VAL2) + DATA_PROCESSING_DNM("bic",0xf00f00ff) + DATA_PROCESSING_DM("mvn",VAL2) + + TEST("mov ip, sp") /* This has special case emulation code */ + + TEST_SUPPORTED("mov pc, #0x1000"); + TEST_SUPPORTED("mov sp, #0x1000"); + TEST_SUPPORTED("cmp pc, #0x1000"); + TEST_SUPPORTED("cmp sp, #0x1000"); + + /* Data-processing with PC as shift*/ + TEST_UNSUPPORTED(".word 0xe15c0f1e @ cmp r12, r14, asl pc") + TEST_UNSUPPORTED(".word 0xe1a0cf1e @ mov r12, r14, asl pc") + TEST_UNSUPPORTED(".word 0xe08caf1e @ add r10, r12, r14, asl pc") + + /* Data-processing with PC as shift*/ + TEST_UNSUPPORTED("movs pc, r1") + TEST_UNSUPPORTED("movs pc, r1, lsl r2") + TEST_UNSUPPORTED("movs pc, #0x10000") + TEST_UNSUPPORTED("adds pc, lr, r1") + TEST_UNSUPPORTED("adds pc, lr, r1, lsl r2") + TEST_UNSUPPORTED("adds pc, lr, #4") + + /* Data-processing with SP as target */ + TEST("add sp, sp, #16") + TEST("sub sp, sp, #8") + TEST("bic sp, sp, #0x20") + TEST("orr sp, sp, #0x20") + TEST_PR( "add sp, r",10,0,", r",11,4,"") + TEST_PRR("add sp, r",10,0,", r",11,4,", asl r",12,1,"") + TEST_P( "mov sp, r",10,0,"") + TEST_PR( "mov sp, r",10,0,", asl r",12,0,"") + + /* Data-processing with PC as target */ + TEST_BF( "add pc, pc, #2f-1b-8") + TEST_BF_R ("add pc, pc, r",14,2f-1f-8,"") + TEST_BF_R ("add pc, r",14,2f-1f-8,", pc") + TEST_BF_R ("mov pc, r",0,2f,"") + TEST_BF_RR("mov pc, r",0,2f,", asl r",1,0,"") + TEST_BB( "sub pc, pc, #1b-2b+8") +#if __LINUX_ARM_ARCH__ >= 6 + TEST_BB( "sub pc, pc, #1b-2b+8-2") /* UNPREDICTABLE before ARMv6 */ +#endif + TEST_BB_R( "sub pc, pc, r",14, 1f-2f+8,"") + TEST_BB_R( "rsb pc, r",14,1f-2f+8,", pc") + TEST_RR( "add pc, pc, r",10,-2,", asl r",11,1,"") +#ifdef 
CONFIG_THUMB2_KERNEL + TEST_ARM_TO_THUMB_INTERWORK_R("add pc, pc, r",0,3f-1f-8+1,"") + TEST_ARM_TO_THUMB_INTERWORK_R("sub pc, r",0,3f+8+1,", #8") +#endif + TEST_GROUP("Miscellaneous instructions") + + TEST("mrs r0, cpsr") + TEST("mrspl r7, cpsr") + TEST("mrs r14, cpsr") + TEST_UNSUPPORTED(".word 0xe10ff000 @ mrs r15, cpsr") + TEST_UNSUPPORTED("mrs r0, spsr") + TEST_UNSUPPORTED("mrs lr, spsr") + + TEST_UNSUPPORTED("msr cpsr, r0") + TEST_UNSUPPORTED("msr cpsr_f, lr") + TEST_UNSUPPORTED("msr spsr, r0") + + TEST_BF_R("bx r",0,2f,"") + TEST_BB_R("bx r",7,2f,"") + TEST_BF_R("bxeq r",14,2f,"") + + TEST_R("clz r0, r",0, 0x0,"") + TEST_R("clzeq r7, r",14,0x1,"") + TEST_R("clz lr, r",7, 0xffffffff,"") + TEST( "clz r4, sp") + TEST_UNSUPPORTED(".word 0x016fff10 @ clz pc, r0") + TEST_UNSUPPORTED(".word 0x016f0f1f @ clz r0, pc") + +#if __LINUX_ARM_ARCH__ >= 6 + TEST_UNSUPPORTED("bxj r0") +#endif + + TEST_BF_R("blx r",0,2f,"") + TEST_BB_R("blx r",7,2f,"") + TEST_BF_R("blxeq r",14,2f,"") + TEST_UNSUPPORTED(".word 0x0120003f @ blx pc") + + TEST_RR( "qadd r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "qaddvs lr, r",9, VAL2,", r",8, VAL1,"") + TEST_R( "qadd lr, r",9, VAL2,", r13") + TEST_RR( "qsub r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "qsubvs lr, r",9, VAL2,", r",8, VAL1,"") + TEST_R( "qsub lr, r",9, VAL2,", r13") + TEST_RR( "qdadd r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "qdaddvs lr, r",9, VAL2,", r",8, VAL1,"") + TEST_R( "qdadd lr, r",9, VAL2,", r13") + TEST_RR( "qdsub r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "qdsubvs lr, r",9, VAL2,", r",8, VAL1,"") + TEST_R( "qdsub lr, r",9, VAL2,", r13") + TEST_UNSUPPORTED(".word 0xe101f050 @ qadd pc, r0, r1") + TEST_UNSUPPORTED(".word 0xe121f050 @ qsub pc, r0, r1") + TEST_UNSUPPORTED(".word 0xe141f050 @ qdadd pc, r0, r1") + TEST_UNSUPPORTED(".word 0xe161f050 @ qdsub pc, r0, r1") + TEST_UNSUPPORTED(".word 0xe16f2050 @ qdsub r2, r0, pc") + TEST_UNSUPPORTED(".word 0xe161205f @ qdsub r2, pc, r1") + + TEST_UNSUPPORTED("bkpt 0xffff") + TEST_UNSUPPORTED("bkpt 0x0000") + + TEST_UNSUPPORTED(".word 0xe1600070 @ smc #0") + + TEST_GROUP("Halfword multiply and multiply-accumulate") + + TEST_RRR( "smlabb r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlabbge r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RR( "smlabb lr, r",1, VAL2,", r",2, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe10f3281 @ smlabb pc, r1, r2, r3") + TEST_RRR( "smlatb r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlatbge r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RR( "smlatb lr, r",1, VAL2,", r",2, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe10f32a1 @ smlatb pc, r1, r2, r3") + TEST_RRR( "smlabt r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlabtge r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RR( "smlabt lr, r",1, VAL2,", r",2, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe10f32c1 @ smlabt pc, r1, r2, r3") + TEST_RRR( "smlatt r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlattge r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RR( "smlatt lr, r",1, VAL2,", r",2, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe10f32e1 @ smlatt pc, r1, r2, r3") + + TEST_RRR( "smlawb r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlawbge r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RR( "smlawb lr, r",1, VAL2,", r",2, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe12f3281 @ smlawb pc, r1, r2, r3") + TEST_RRR( "smlawt r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlawtge r7, r",8, VAL3,", r",9, VAL1,", r",10, 
VAL2,"") + TEST_RR( "smlawt lr, r",1, VAL2,", r",2, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe12f32c1 @ smlawt pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe12032cf @ smlawt r0, pc, r2, r3") + TEST_UNSUPPORTED(".word 0xe1203fc1 @ smlawt r0, r1, pc, r3") + TEST_UNSUPPORTED(".word 0xe120f2c1 @ smlawt r0, r1, r2, pc") + + TEST_RR( "smulwb r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smulwbge r7, r",8, VAL3,", r",9, VAL1,"") + TEST_R( "smulwb lr, r",1, VAL2,", r13") + TEST_UNSUPPORTED(".word 0xe12f02a1 @ smulwb pc, r1, r2") + TEST_RR( "smulwt r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smulwtge r7, r",8, VAL3,", r",9, VAL1,"") + TEST_R( "smulwt lr, r",1, VAL2,", r13") + TEST_UNSUPPORTED(".word 0xe12f02e1 @ smulwt pc, r1, r2") + + TEST_RRRR( "smlalbb r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "smlalbble r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRR( "smlalbb r",14,VAL3,", r",7, VAL4,", r",5, VAL1,", r13") + TEST_UNSUPPORTED(".word 0xe14f1382 @ smlalbb pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe141f382 @ smlalbb r1, pc, r2, r3") + TEST_RRRR( "smlaltb r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "smlaltble r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRR( "smlaltb r",14,VAL3,", r",7, VAL4,", r",5, VAL1,", r13") + TEST_UNSUPPORTED(".word 0xe14f13a2 @ smlaltb pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe141f3a2 @ smlaltb r1, pc, r2, r3") + TEST_RRRR( "smlalbt r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "smlalbtle r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRR( "smlalbt r",14,VAL3,", r",7, VAL4,", r",5, VAL1,", r13") + TEST_UNSUPPORTED(".word 0xe14f13c2 @ smlalbt pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe141f3c2 @ smlalbt r1, pc, r2, r3") + TEST_RRRR( "smlaltt r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "smlalttle r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRR( "smlaltt r",14,VAL3,", r",7, VAL4,", r",5, VAL1,", r13") + TEST_UNSUPPORTED(".word 0xe14f13e2 @ smlalbb pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe140f3e2 @ smlalbb r0, pc, r2, r3") + TEST_UNSUPPORTED(".word 0xe14013ef @ smlalbb r0, r1, pc, r3") + TEST_UNSUPPORTED(".word 0xe1401fe2 @ smlalbb r0, r1, r2, pc") + + TEST_RR( "smulbb r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smulbbge r7, r",8, VAL3,", r",9, VAL1,"") + TEST_R( "smulbb lr, r",1, VAL2,", r13") + TEST_UNSUPPORTED(".word 0xe16f0281 @ smulbb pc, r1, r2") + TEST_RR( "smultb r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smultbge r7, r",8, VAL3,", r",9, VAL1,"") + TEST_R( "smultb lr, r",1, VAL2,", r13") + TEST_UNSUPPORTED(".word 0xe16f02a1 @ smultb pc, r1, r2") + TEST_RR( "smulbt r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smulbtge r7, r",8, VAL3,", r",9, VAL1,"") + TEST_R( "smulbt lr, r",1, VAL2,", r13") + TEST_UNSUPPORTED(".word 0xe16f02c1 @ smultb pc, r1, r2") + TEST_RR( "smultt r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smulttge r7, r",8, VAL3,", r",9, VAL1,"") + TEST_R( "smultt lr, r",1, VAL2,", r13") + TEST_UNSUPPORTED(".word 0xe16f02e1 @ smultt pc, r1, r2") + TEST_UNSUPPORTED(".word 0xe16002ef @ smultt r0, pc, r2") + TEST_UNSUPPORTED(".word 0xe1600fe1 @ smultt r0, r1, pc") + + TEST_GROUP("Multiply and multiply-accumulate") + + TEST_RR( "mul r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "mulls r7, r",8, VAL2,", r",9, VAL2,"") + TEST_R( "mul lr, r",4, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe00f0291 @ mul pc, r1, r2") + TEST_UNSUPPORTED(".word 0xe000029f @ mul r0, pc, r2") + TEST_UNSUPPORTED(".word 
0xe0000f91 @ mul r0, r1, pc") + TEST_RR( "muls r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "mullss r7, r",8, VAL2,", r",9, VAL2,"") + TEST_R( "muls lr, r",4, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe01f0291 @ muls pc, r1, r2") + + TEST_RRR( "mla r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "mlahi r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RR( "mla lr, r",1, VAL2,", r",2, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe02f3291 @ mla pc, r1, r2, r3") + TEST_RRR( "mlas r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "mlahis r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RR( "mlas lr, r",1, VAL2,", r",2, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe03f3291 @ mlas pc, r1, r2, r3") + +#if __LINUX_ARM_ARCH__ >= 6 + TEST_RR( "umaal r0, r1, r",2, VAL1,", r",3, VAL2,"") + TEST_RR( "umaalls r7, r8, r",9, VAL2,", r",10, VAL1,"") + TEST_R( "umaal lr, r12, r",11,VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe041f392 @ umaal pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe04f0392 @ umaal r0, pc, r2, r3") + TEST_UNSUPPORTED(".word 0xe0500090 @ undef") + TEST_UNSUPPORTED(".word 0xe05fff9f @ undef") + + TEST_RRR( "mls r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "mlshi r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RR( "mls lr, r",1, VAL2,", r",2, VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe06f3291 @ mls pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe060329f @ mls r0, pc, r2, r3") + TEST_UNSUPPORTED(".word 0xe0603f91 @ mls r0, r1, pc, r3") + TEST_UNSUPPORTED(".word 0xe060f291 @ mls r0, r1, r2, pc") +#endif + + TEST_UNSUPPORTED(".word 0xe0700090 @ undef") + TEST_UNSUPPORTED(".word 0xe07fff9f @ undef") + + TEST_RR( "umull r0, r1, r",2, VAL1,", r",3, VAL2,"") + TEST_RR( "umullls r7, r8, r",9, VAL2,", r",10, VAL1,"") + TEST_R( "umull lr, r12, r",11,VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe081f392 @ umull pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe08f1392 @ umull r1, pc, r2, r3") + TEST_RR( "umulls r0, r1, r",2, VAL1,", r",3, VAL2,"") + TEST_RR( "umulllss r7, r8, r",9, VAL2,", r",10, VAL1,"") + TEST_R( "umulls lr, r12, r",11,VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe091f392 @ umulls pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe09f1392 @ umulls r1, pc, r2, r3") + + TEST_RRRR( "umlal r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "umlalle r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRR( "umlal r",14,VAL3,", r",7, VAL4,", r",5, VAL1,", r13") + TEST_UNSUPPORTED(".word 0xe0af1392 @ umlal pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe0a1f392 @ umlal r1, pc, r2, r3") + TEST_RRRR( "umlals r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "umlalles r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRR( "umlals r",14,VAL3,", r",7, VAL4,", r",5, VAL1,", r13") + TEST_UNSUPPORTED(".word 0xe0bf1392 @ umlals pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe0b1f392 @ umlals r1, pc, r2, r3") + + TEST_RR( "smull r0, r1, r",2, VAL1,", r",3, VAL2,"") + TEST_RR( "smullls r7, r8, r",9, VAL2,", r",10, VAL1,"") + TEST_R( "smull lr, r12, r",11,VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe0c1f392 @ smull pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe0cf1392 @ smull r1, pc, r2, r3") + TEST_RR( "smulls r0, r1, r",2, VAL1,", r",3, VAL2,"") + TEST_RR( "smulllss r7, r8, r",9, VAL2,", r",10, VAL1,"") + TEST_R( "smulls lr, r12, r",11,VAL3,", r13") + TEST_UNSUPPORTED(".word 0xe0d1f392 @ smulls pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe0df1392 @ smulls r1, pc, r2, r3") + + TEST_RRRR( "smlal r",0, VAL1,", r",1, VAL2,", r",2, 
VAL3,", r",3, VAL4) + TEST_RRRR( "smlalle r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRR( "smlal r",14,VAL3,", r",7, VAL4,", r",5, VAL1,", r13") + TEST_UNSUPPORTED(".word 0xe0ef1392 @ smlal pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe0e1f392 @ smlal r1, pc, r2, r3") + TEST_RRRR( "smlals r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "smlalles r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRR( "smlals r",14,VAL3,", r",7, VAL4,", r",5, VAL1,", r13") + TEST_UNSUPPORTED(".word 0xe0ff1392 @ smlals pc, r1, r2, r3") + TEST_UNSUPPORTED(".word 0xe0f0f392 @ smlals r0, pc, r2, r3") + TEST_UNSUPPORTED(".word 0xe0f0139f @ smlals r0, r1, pc, r3") + TEST_UNSUPPORTED(".word 0xe0f01f92 @ smlals r0, r1, r2, pc") + + TEST_GROUP("Synchronization primitives") + + /* + * Use hard coded constants for SWP instructions to avoid warnings + * about deprecated instructions. + */ + TEST_RP( ".word 0xe108e097 @ swp lr, r",7,VAL2,", [r",8,0,"]") + TEST_R( ".word 0x610d0091 @ swpvs r0, r",1,VAL1,", [sp]") + TEST_RP( ".word 0xe10cd09e @ swp sp, r",14,VAL2,", [r",12,13*4,"]") + TEST_UNSUPPORTED(".word 0xe102f091 @ swp pc, r1, [r2]") + TEST_UNSUPPORTED(".word 0xe102009f @ swp r0, pc, [r2]") + TEST_UNSUPPORTED(".word 0xe10f0091 @ swp r0, r1, [pc]") + TEST_RP( ".word 0xe148e097 @ swpb lr, r",7,VAL2,", [r",8,0,"]") + TEST_R( ".word 0x614d0091 @ swpvsb r0, r",1,VAL1,", [sp]") + TEST_UNSUPPORTED(".word 0xe142f091 @ swpb pc, r1, [r2]") + + TEST_UNSUPPORTED(".word 0xe1100090") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe1200090") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe1300090") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe1500090") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe1600090") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe1700090") /* Unallocated space */ +#if __LINUX_ARM_ARCH__ >= 6 + TEST_UNSUPPORTED("ldrex r2, [sp]") + TEST_UNSUPPORTED("strexd r0, r2, r3, [sp]") + TEST_UNSUPPORTED("ldrexd r2, r3, [sp]") + TEST_UNSUPPORTED("strexb r0, r2, [sp]") + TEST_UNSUPPORTED("ldrexb r2, [sp]") + TEST_UNSUPPORTED("strexh r0, r2, [sp]") + TEST_UNSUPPORTED("ldrexh r2, [sp]") +#endif + TEST_GROUP("Extra load/store instructions") + + TEST_RPR( "strh r",0, VAL1,", [r",1, 48,", -r",2, 24,"]") + TEST_RPR( "streqh r",14,VAL2,", [r",13,0, ", r",12, 48,"]") + TEST_RPR( "strh r",1, VAL1,", [r",2, 24,", r",3, 48,"]!") + TEST_RPR( "strneh r",12,VAL2,", [r",11,48,", -r",10,24,"]!") + TEST_RPR( "strh r",2, VAL1,", [r",3, 24,"], r",4, 48,"") + TEST_RPR( "strh r",10,VAL2,", [r",9, 48,"], -r",11,24,"") + TEST_UNSUPPORTED(".word 0xe1afc0ba @ strh r12, [pc, r10]!") + TEST_UNSUPPORTED(".word 0xe089f0bb @ strh pc, [r9], r11") + TEST_UNSUPPORTED(".word 0xe089a0bf @ strh r10, [r9], pc") + + TEST_PR( "ldrh r0, [r",0, 48,", -r",2, 24,"]") + TEST_PR( "ldrcsh r14, [r",13,0, ", r",12, 48,"]") + TEST_PR( "ldrh r1, [r",2, 24,", r",3, 48,"]!") + TEST_PR( "ldrcch r12, [r",11,48,", -r",10,24,"]!") + TEST_PR( "ldrh r2, [r",3, 24,"], r",4, 48,"") + TEST_PR( "ldrh r10, [r",9, 48,"], -r",11,24,"") + TEST_UNSUPPORTED(".word 0xe1bfc0ba @ ldrh r12, [pc, r10]!") + TEST_UNSUPPORTED(".word 0xe099f0bb @ ldrh pc, [r9], r11") + TEST_UNSUPPORTED(".word 0xe099a0bf @ ldrh r10, [r9], pc") + + TEST_RP( "strh r",0, VAL1,", [r",1, 24,", #-2]") + TEST_RP( "strmih r",14,VAL2,", [r",13,0, ", #2]") + TEST_RP( "strh r",1, VAL1,", [r",2, 24,", #4]!") + TEST_RP( "strplh r",12,VAL2,", [r",11,24,", #-4]!") + TEST_RP( "strh r",2, VAL1,", [r",3, 24,"], #48") + TEST_RP( "strh r",10,VAL2,", [r",9, 
64,"], #-48") + TEST_UNSUPPORTED(".word 0xe1efc3b0 @ strh r12, [pc, #48]!") + TEST_UNSUPPORTED(".word 0xe0c9f3b0 @ strh pc, [r9], #48") + + TEST_P( "ldrh r0, [r",0, 24,", #-2]") + TEST_P( "ldrvsh r14, [r",13,0, ", #2]") + TEST_P( "ldrh r1, [r",2, 24,", #4]!") + TEST_P( "ldrvch r12, [r",11,24,", #-4]!") + TEST_P( "ldrh r2, [r",3, 24,"], #48") + TEST_P( "ldrh r10, [r",9, 64,"], #-48") + TEST( "ldrh r0, [pc, #0]") + TEST_UNSUPPORTED(".word 0xe1ffc3b0 @ ldrh r12, [pc, #48]!") + TEST_UNSUPPORTED(".word 0xe0d9f3b0 @ ldrh pc, [r9], #48") + + TEST_PR( "ldrsb r0, [r",0, 48,", -r",2, 24,"]") + TEST_PR( "ldrhisb r14, [r",13,0,", r",12, 48,"]") + TEST_PR( "ldrsb r1, [r",2, 24,", r",3, 48,"]!") + TEST_PR( "ldrlssb r12, [r",11,48,", -r",10,24,"]!") + TEST_PR( "ldrsb r2, [r",3, 24,"], r",4, 48,"") + TEST_PR( "ldrsb r10, [r",9, 48,"], -r",11,24,"") + TEST_UNSUPPORTED(".word 0xe1bfc0da @ ldrsb r12, [pc, r10]!") + TEST_UNSUPPORTED(".word 0xe099f0db @ ldrsb pc, [r9], r11") + + TEST_P( "ldrsb r0, [r",0, 24,", #-1]") + TEST_P( "ldrgesb r14, [r",13,0, ", #1]") + TEST_P( "ldrsb r1, [r",2, 24,", #4]!") + TEST_P( "ldrltsb r12, [r",11,24,", #-4]!") + TEST_P( "ldrsb r2, [r",3, 24,"], #48") + TEST_P( "ldrsb r10, [r",9, 64,"], #-48") + TEST( "ldrsb r0, [pc, #0]") + TEST_UNSUPPORTED(".word 0xe1ffc3d0 @ ldrsb r12, [pc, #48]!") + TEST_UNSUPPORTED(".word 0xe0d9f3d0 @ ldrsb pc, [r9], #48") + + TEST_PR( "ldrsh r0, [r",0, 48,", -r",2, 24,"]") + TEST_PR( "ldrgtsh r14, [r",13,0, ", r",12, 48,"]") + TEST_PR( "ldrsh r1, [r",2, 24,", r",3, 48,"]!") + TEST_PR( "ldrlesh r12, [r",11,48,", -r",10,24,"]!") + TEST_PR( "ldrsh r2, [r",3, 24,"], r",4, 48,"") + TEST_PR( "ldrsh r10, [r",9, 48,"], -r",11,24,"") + TEST_UNSUPPORTED(".word 0xe1bfc0fa @ ldrsh r12, [pc, r10]!") + TEST_UNSUPPORTED(".word 0xe099f0fb @ ldrsh pc, [r9], r11") + + TEST_P( "ldrsh r0, [r",0, 24,", #-1]") + TEST_P( "ldreqsh r14, [r",13,0 ,", #1]") + TEST_P( "ldrsh r1, [r",2, 24,", #4]!") + TEST_P( "ldrnesh r12, [r",11,24,", #-4]!") + TEST_P( "ldrsh r2, [r",3, 24,"], #48") + TEST_P( "ldrsh r10, [r",9, 64,"], #-48") + TEST( "ldrsh r0, [pc, #0]") + TEST_UNSUPPORTED(".word 0xe1ffc3f0 @ ldrsh r12, [pc, #48]!") + TEST_UNSUPPORTED(".word 0xe0d9f3f0 @ ldrsh pc, [r9], #48") + +#if __LINUX_ARM_ARCH__ >= 7 + TEST_UNSUPPORTED("strht r1, [r2], r3") + TEST_UNSUPPORTED("ldrht r1, [r2], r3") + TEST_UNSUPPORTED("strht r1, [r2], #48") + TEST_UNSUPPORTED("ldrht r1, [r2], #48") + TEST_UNSUPPORTED("ldrsbt r1, [r2], r3") + TEST_UNSUPPORTED("ldrsbt r1, [r2], #48") + TEST_UNSUPPORTED("ldrsht r1, [r2], r3") + TEST_UNSUPPORTED("ldrsht r1, [r2], #48") +#endif + + TEST_RPR( "strd r",0, VAL1,", [r",1, 48,", -r",2,24,"]") + TEST_RPR( "strccd r",8, VAL2,", [r",13,0, ", r",12,48,"]") + TEST_RPR( "strd r",4, VAL1,", [r",2, 24,", r",3, 48,"]!") + TEST_RPR( "strcsd r",12,VAL2,", [r",11,48,", -r",10,24,"]!") + TEST_RPR( "strd r",2, VAL1,", [r",3, 24,"], r",4,48,"") + TEST_RPR( "strd r",10,VAL2,", [r",9, 48,"], -r",7,24,"") + TEST_UNSUPPORTED(".word 0xe1afc0fa @ strd r12, [pc, r10]!") + + TEST_PR( "ldrd r0, [r",0, 48,", -r",2,24,"]") + TEST_PR( "ldrmid r8, [r",13,0, ", r",12,48,"]") + TEST_PR( "ldrd r4, [r",2, 24,", r",3, 48,"]!") + TEST_PR( "ldrpld r6, [r",11,48,", -r",10,24,"]!") + TEST_PR( "ldrd r2, [r",5, 24,"], r",4,48,"") + TEST_PR( "ldrd r10, [r",9,48,"], -r",7,24,"") + TEST_UNSUPPORTED(".word 0xe1afc0da @ ldrd r12, [pc, r10]!") + TEST_UNSUPPORTED(".word 0xe089f0db @ ldrd pc, [r9], r11") + TEST_UNSUPPORTED(".word 0xe089e0db @ ldrd lr, [r9], r11") + TEST_UNSUPPORTED(".word 0xe089c0df @ ldrd r12, [r9], 
pc") + + TEST_RP( "strd r",0, VAL1,", [r",1, 24,", #-8]") + TEST_RP( "strvsd r",8, VAL2,", [r",13,0, ", #8]") + TEST_RP( "strd r",4, VAL1,", [r",2, 24,", #16]!") + TEST_RP( "strvcd r",12,VAL2,", [r",11,24,", #-16]!") + TEST_RP( "strd r",2, VAL1,", [r",4, 24,"], #48") + TEST_RP( "strd r",10,VAL2,", [r",9, 64,"], #-48") + TEST_UNSUPPORTED(".word 0xe1efc3f0 @ strd r12, [pc, #48]!") + + TEST_P( "ldrd r0, [r",0, 24,", #-8]") + TEST_P( "ldrhid r8, [r",13,0, ", #8]") + TEST_P( "ldrd r4, [r",2, 24,", #16]!") + TEST_P( "ldrlsd r6, [r",11,24,", #-16]!") + TEST_P( "ldrd r2, [r",5, 24,"], #48") + TEST_P( "ldrd r10, [r",9,6,"], #-48") + TEST_UNSUPPORTED(".word 0xe1efc3d0 @ ldrd r12, [pc, #48]!") + TEST_UNSUPPORTED(".word 0xe0c9f3d0 @ ldrd pc, [r9], #48") + TEST_UNSUPPORTED(".word 0xe0c9e3d0 @ ldrd lr, [r9], #48") + + TEST_GROUP("Miscellaneous") + +#if __LINUX_ARM_ARCH__ >= 7 + TEST("movw r0, #0") + TEST("movw r0, #0xffff") + TEST("movw lr, #0xffff") + TEST_UNSUPPORTED(".word 0xe300f000 @ movw pc, #0") + TEST_R("movt r",0, VAL1,", #0") + TEST_R("movt r",0, VAL2,", #0xffff") + TEST_R("movt r",14,VAL1,", #0xffff") + TEST_UNSUPPORTED(".word 0xe340f000 @ movt pc, #0") +#endif + + TEST_UNSUPPORTED("msr cpsr, 0x13") + TEST_UNSUPPORTED("msr cpsr_f, 0xf0000000") + TEST_UNSUPPORTED("msr spsr, 0x13") + +#if __LINUX_ARM_ARCH__ >= 7 + TEST_SUPPORTED("yield") + TEST("sev") + TEST("nop") + TEST("wfi") + TEST_SUPPORTED("wfe") + TEST_UNSUPPORTED("dbg #0") +#endif + + TEST_GROUP("Load/store word and unsigned byte") + +#define LOAD_STORE(byte) \ + TEST_RP( "str"byte" r",0, VAL1,", [r",1, 24,", #-2]") \ + TEST_RP( "str"byte" r",14,VAL2,", [r",13,0, ", #2]") \ + TEST_RP( "str"byte" r",1, VAL1,", [r",2, 24,", #4]!") \ + TEST_RP( "str"byte" r",12,VAL2,", [r",11,24,", #-4]!") \ + TEST_RP( "str"byte" r",2, VAL1,", [r",3, 24,"], #48") \ + TEST_RP( "str"byte" r",10,VAL2,", [r",9, 64,"], #-48") \ + TEST_RPR("str"byte" r",0, VAL1,", [r",1, 48,", -r",2, 24,"]") \ + TEST_RPR("str"byte" r",14,VAL2,", [r",13,0, ", r",12, 48,"]") \ + TEST_RPR("str"byte" r",1, VAL1,", [r",2, 24,", r",3, 48,"]!") \ + TEST_RPR("str"byte" r",12,VAL2,", [r",11,48,", -r",10,24,"]!") \ + TEST_RPR("str"byte" r",2, VAL1,", [r",3, 24,"], r",4, 48,"") \ + TEST_RPR("str"byte" r",10,VAL2,", [r",9, 48,"], -r",11,24,"") \ + TEST_RPR("str"byte" r",0, VAL1,", [r",1, 24,", r",2, 32,", asl #1]")\ + TEST_RPR("str"byte" r",14,VAL2,", [r",13,0, ", r",12, 32,", lsr #2]")\ + TEST_RPR("str"byte" r",1, VAL1,", [r",2, 24,", r",3, 32,", asr #3]!")\ + TEST_RPR("str"byte" r",12,VAL2,", [r",11,24,", r",10, 4,", ror #31]!")\ + TEST_P( "ldr"byte" r0, [r",0, 24,", #-2]") \ + TEST_P( "ldr"byte" r14, [r",13,0, ", #2]") \ + TEST_P( "ldr"byte" r1, [r",2, 24,", #4]!") \ + TEST_P( "ldr"byte" r12, [r",11,24,", #-4]!") \ + TEST_P( "ldr"byte" r2, [r",3, 24,"], #48") \ + TEST_P( "ldr"byte" r10, [r",9, 64,"], #-48") \ + TEST_PR( "ldr"byte" r0, [r",0, 48,", -r",2, 24,"]") \ + TEST_PR( "ldr"byte" r14, [r",13,0, ", r",12, 48,"]") \ + TEST_PR( "ldr"byte" r1, [r",2, 24,", r",3, 48,"]!") \ + TEST_PR( "ldr"byte" r12, [r",11,48,", -r",10,24,"]!") \ + TEST_PR( "ldr"byte" r2, [r",3, 24,"], r",4, 48,"") \ + TEST_PR( "ldr"byte" r10, [r",9, 48,"], -r",11,24,"") \ + TEST_PR( "ldr"byte" r0, [r",0, 24,", r",2, 32,", asl #1]") \ + TEST_PR( "ldr"byte" r14, [r",13,0, ", r",12, 32,", lsr #2]") \ + TEST_PR( "ldr"byte" r1, [r",2, 24,", r",3, 32,", asr #3]!") \ + TEST_PR( "ldr"byte" r12, [r",11,24,", r",10, 4,", ror #31]!") \ + TEST( "ldr"byte" r0, [pc, #0]") \ + TEST_R( "ldr"byte" r12, [pc, r",14,0,"]") + + 
LOAD_STORE("") + TEST_P( "str pc, [r",0,0,", #15*4]") + TEST_R( "str pc, [sp, r",2,15*4,"]") + TEST_BF( "ldr pc, [sp, #15*4]") + TEST_BF_R("ldr pc, [sp, r",2,15*4,"]") + + TEST_P( "str sp, [r",0,0,", #13*4]") + TEST_R( "str sp, [sp, r",2,13*4,"]") + TEST_BF( "ldr sp, [sp, #13*4]") + TEST_BF_R("ldr sp, [sp, r",2,13*4,"]") + +#ifdef CONFIG_THUMB2_KERNEL + TEST_ARM_TO_THUMB_INTERWORK_P("ldr pc, [r",0,0,", #15*4]") +#endif + TEST_UNSUPPORTED(".word 0xe5af6008 @ str r6, [pc, #8]!") + TEST_UNSUPPORTED(".word 0xe7af6008 @ str r6, [pc, r8]!") + TEST_UNSUPPORTED(".word 0xe5bf6008 @ ldr r6, [pc, #8]!") + TEST_UNSUPPORTED(".word 0xe7bf6008 @ ldr r6, [pc, r8]!") + TEST_UNSUPPORTED(".word 0xe788600f @ str r6, [r8, pc]") + TEST_UNSUPPORTED(".word 0xe798600f @ ldr r6, [r8, pc]") + + LOAD_STORE("b") + TEST_UNSUPPORTED(".word 0xe5f7f008 @ ldrb pc, [r7, #8]!") + TEST_UNSUPPORTED(".word 0xe7f7f008 @ ldrb pc, [r7, r8]!") + TEST_UNSUPPORTED(".word 0xe5ef6008 @ strb r6, [pc, #8]!") + TEST_UNSUPPORTED(".word 0xe7ef6008 @ strb r6, [pc, r3]!") + TEST_UNSUPPORTED(".word 0xe5ff6008 @ ldrb r6, [pc, #8]!") + TEST_UNSUPPORTED(".word 0xe7ff6008 @ ldrb r6, [pc, r3]!") + + TEST_UNSUPPORTED("ldrt r0, [r1], #4") + TEST_UNSUPPORTED("ldrt r1, [r2], r3") + TEST_UNSUPPORTED("strt r2, [r3], #4") + TEST_UNSUPPORTED("strt r3, [r4], r5") + TEST_UNSUPPORTED("ldrbt r4, [r5], #4") + TEST_UNSUPPORTED("ldrbt r5, [r6], r7") + TEST_UNSUPPORTED("strbt r6, [r7], #4") + TEST_UNSUPPORTED("strbt r7, [r8], r9") + +#if __LINUX_ARM_ARCH__ >= 7 + TEST_GROUP("Parallel addition and subtraction, signed") + + TEST_UNSUPPORTED(".word 0xe6000010") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe60fffff") /* Unallocated space */ + + TEST_RR( "sadd16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "sadd16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe61cff1a @ sadd16 pc, r12, r10") + TEST_RR( "sasx r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "sasx r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe61cff3a @ sasx pc, r12, r10") + TEST_RR( "ssax r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "ssax r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe61cff5a @ ssax pc, r12, r10") + TEST_RR( "ssub16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "ssub16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe61cff7a @ ssub16 pc, r12, r10") + TEST_RR( "sadd8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "sadd8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe61cff9a @ sadd8 pc, r12, r10") + TEST_UNSUPPORTED(".word 0xe61000b0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe61fffbf") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe61000d0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe61fffdf") /* Unallocated space */ + TEST_RR( "ssub8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "ssub8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe61cfffa @ ssub8 pc, r12, r10") + + TEST_RR( "qadd16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "qadd16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe62cff1a @ qadd16 pc, r12, r10") + TEST_RR( "qasx r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "qasx r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe62cff3a @ qasx pc, r12, r10") + TEST_RR( "qsax r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "qsax r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe62cff5a @ qsax pc, r12, r10") + TEST_RR( "qsub16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "qsub16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe62cff7a @ qsub16 pc, r12, r10") + TEST_RR( 
"qadd8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "qadd8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe62cff9a @ qadd8 pc, r12, r10") + TEST_UNSUPPORTED(".word 0xe62000b0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe62fffbf") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe62000d0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe62fffdf") /* Unallocated space */ + TEST_RR( "qsub8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "qsub8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe62cfffa @ qsub8 pc, r12, r10") + + TEST_RR( "shadd16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "shadd16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe63cff1a @ shadd16 pc, r12, r10") + TEST_RR( "shasx r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "shasx r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe63cff3a @ shasx pc, r12, r10") + TEST_RR( "shsax r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "shsax r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe63cff5a @ shsax pc, r12, r10") + TEST_RR( "shsub16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "shsub16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe63cff7a @ shsub16 pc, r12, r10") + TEST_RR( "shadd8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "shadd8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe63cff9a @ shadd8 pc, r12, r10") + TEST_UNSUPPORTED(".word 0xe63000b0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe63fffbf") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe63000d0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe63fffdf") /* Unallocated space */ + TEST_RR( "shsub8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "shsub8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe63cfffa @ shsub8 pc, r12, r10") + + TEST_GROUP("Parallel addition and subtraction, unsigned") + + TEST_UNSUPPORTED(".word 0xe6400010") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe64fffff") /* Unallocated space */ + + TEST_RR( "uadd16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uadd16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe65cff1a @ uadd16 pc, r12, r10") + TEST_RR( "uasx r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uasx r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe65cff3a @ uasx pc, r12, r10") + TEST_RR( "usax r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "usax r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe65cff5a @ usax pc, r12, r10") + TEST_RR( "usub16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "usub16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe65cff7a @ usub16 pc, r12, r10") + TEST_RR( "uadd8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uadd8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe65cff9a @ uadd8 pc, r12, r10") + TEST_UNSUPPORTED(".word 0xe65000b0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe65fffbf") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe65000d0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe65fffdf") /* Unallocated space */ + TEST_RR( "usub8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "usub8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe65cfffa @ usub8 pc, r12, r10") + + TEST_RR( "uqadd16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uqadd16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe66cff1a @ uqadd16 pc, r12, r10") + TEST_RR( "uqasx r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uqasx r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe66cff3a @ uqasx pc, r12, r10") + TEST_RR( "uqsax r0, r",0, HH1,", r",1, 
HH2,"") + TEST_RR( "uqsax r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe66cff5a @ uqsax pc, r12, r10") + TEST_RR( "uqsub16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uqsub16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe66cff7a @ uqsub16 pc, r12, r10") + TEST_RR( "uqadd8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uqadd8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe66cff9a @ uqadd8 pc, r12, r10") + TEST_UNSUPPORTED(".word 0xe66000b0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe66fffbf") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe66000d0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe66fffdf") /* Unallocated space */ + TEST_RR( "uqsub8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uqsub8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe66cfffa @ uqsub8 pc, r12, r10") + + TEST_RR( "uhadd16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uhadd16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe67cff1a @ uhadd16 pc, r12, r10") + TEST_RR( "uhasx r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uhasx r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe67cff3a @ uhasx pc, r12, r10") + TEST_RR( "uhsax r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uhsax r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe67cff5a @ uhsax pc, r12, r10") + TEST_RR( "uhsub16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uhsub16 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe67cff7a @ uhsub16 pc, r12, r10") + TEST_RR( "uhadd8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uhadd8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe67cff9a @ uhadd8 pc, r12, r10") + TEST_UNSUPPORTED(".word 0xe67000b0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe67fffbf") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe67000d0") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe67fffdf") /* Unallocated space */ + TEST_RR( "uhsub8 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uhsub8 r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe67cfffa @ uhsub8 pc, r12, r10") + TEST_UNSUPPORTED(".word 0xe67feffa @ uhsub8 r14, pc, r10") + TEST_UNSUPPORTED(".word 0xe67cefff @ uhsub8 r14, r12, pc") +#endif /* __LINUX_ARM_ARCH__ >= 7 */ + +#if __LINUX_ARM_ARCH__ >= 6 + TEST_GROUP("Packing, unpacking, saturation, and reversal") + + TEST_RR( "pkhbt r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "pkhbt r14,r",12, HH1,", r",10,HH2,", lsl #2") + TEST_UNSUPPORTED(".word 0xe68cf11a @ pkhbt pc, r12, r10, lsl #2") + TEST_RR( "pkhtb r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "pkhtb r14,r",12, HH1,", r",10,HH2,", asr #2") + TEST_UNSUPPORTED(".word 0xe68cf15a @ pkhtb pc, r12, r10, asr #2") + TEST_UNSUPPORTED(".word 0xe68fe15a @ pkhtb r14, pc, r10, asr #2") + TEST_UNSUPPORTED(".word 0xe68ce15f @ pkhtb r14, r12, pc, asr #2") + TEST_UNSUPPORTED(".word 0xe6900010") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe69fffdf") /* Unallocated space */ + + TEST_R( "ssat r0, #24, r",0, VAL1,"") + TEST_R( "ssat r14, #24, r",12, VAL2,"") + TEST_R( "ssat r0, #24, r",0, VAL1,", lsl #8") + TEST_R( "ssat r14, #24, r",12, VAL2,", asr #8") + TEST_UNSUPPORTED(".word 0xe6b7f01c @ ssat pc, #24, r12") + + TEST_R( "usat r0, #24, r",0, VAL1,"") + TEST_R( "usat r14, #24, r",12, VAL2,"") + TEST_R( "usat r0, #24, r",0, VAL1,", lsl #8") + TEST_R( "usat r14, #24, r",12, VAL2,", asr #8") + TEST_UNSUPPORTED(".word 0xe6f7f01c @ usat pc, #24, r12") + + TEST_RR( "sxtab16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "sxtab16 r14,r",12, HH2,", r",10,HH1,", ror #8") + 
TEST_R( "sxtb16 r8, r",7, HH1,"") + TEST_UNSUPPORTED(".word 0xe68cf47a @ sxtab16 pc,r12, r10, ror #8") + + TEST_RR( "sel r0, r",0, VAL1,", r",1, VAL2,"") + TEST_RR( "sel r14, r",12,VAL1,", r",10, VAL2,"") + TEST_UNSUPPORTED(".word 0xe68cffba @ sel pc, r12, r10") + TEST_UNSUPPORTED(".word 0xe68fefba @ sel r14, pc, r10") + TEST_UNSUPPORTED(".word 0xe68cefbf @ sel r14, r12, pc") + + TEST_R( "ssat16 r0, #12, r",0, HH1,"") + TEST_R( "ssat16 r14, #12, r",12, HH2,"") + TEST_UNSUPPORTED(".word 0xe6abff3c @ ssat16 pc, #12, r12") + + TEST_RR( "sxtab r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "sxtab r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "sxtb r8, r",7, HH1,"") + TEST_UNSUPPORTED(".word 0xe6acf47a @ sxtab pc,r12, r10, ror #8") + + TEST_R( "rev r0, r",0, VAL1,"") + TEST_R( "rev r14, r",12, VAL2,"") + TEST_UNSUPPORTED(".word 0xe6bfff3c @ rev pc, r12") + + TEST_RR( "sxtah r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "sxtah r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "sxth r8, r",7, HH1,"") + TEST_UNSUPPORTED(".word 0xe6bcf47a @ sxtah pc,r12, r10, ror #8") + + TEST_R( "rev16 r0, r",0, VAL1,"") + TEST_R( "rev16 r14, r",12, VAL2,"") + TEST_UNSUPPORTED(".word 0xe6bfffbc @ rev16 pc, r12") + + TEST_RR( "uxtab16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uxtab16 r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "uxtb16 r8, r",7, HH1,"") + TEST_UNSUPPORTED(".word 0xe6ccf47a @ uxtab16 pc,r12, r10, ror #8") + + TEST_R( "usat16 r0, #12, r",0, HH1,"") + TEST_R( "usat16 r14, #12, r",12, HH2,"") + TEST_UNSUPPORTED(".word 0xe6ecff3c @ usat16 pc, #12, r12") + TEST_UNSUPPORTED(".word 0xe6ecef3f @ usat16 r14, #12, pc") + + TEST_RR( "uxtab r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uxtab r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "uxtb r8, r",7, HH1,"") + TEST_UNSUPPORTED(".word 0xe6ecf47a @ uxtab pc,r12, r10, ror #8") + +#if __LINUX_ARM_ARCH__ >= 7 + TEST_R( "rbit r0, r",0, VAL1,"") + TEST_R( "rbit r14, r",12, VAL2,"") + TEST_UNSUPPORTED(".word 0xe6ffff3c @ rbit pc, r12") +#endif + + TEST_RR( "uxtah r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uxtah r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "uxth r8, r",7, HH1,"") + TEST_UNSUPPORTED(".word 0xe6fff077 @ uxth pc, r7") + TEST_UNSUPPORTED(".word 0xe6ff807f @ uxth r8, pc") + TEST_UNSUPPORTED(".word 0xe6fcf47a @ uxtah pc, r12, r10, ror #8") + TEST_UNSUPPORTED(".word 0xe6fce47f @ uxtah r14, r12, pc, ror #8") + + TEST_R( "revsh r0, r",0, VAL1,"") + TEST_R( "revsh r14, r",12, VAL2,"") + TEST_UNSUPPORTED(".word 0xe6ffff3c @ revsh pc, r12") + TEST_UNSUPPORTED(".word 0xe6ffef3f @ revsh r14, pc") + + TEST_UNSUPPORTED(".word 0xe6900070") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe69fff7f") /* Unallocated space */ + + TEST_UNSUPPORTED(".word 0xe6d00070") /* Unallocated space */ + TEST_UNSUPPORTED(".word 0xe6dfff7f") /* Unallocated space */ +#endif /* __LINUX_ARM_ARCH__ >= 6 */ + +#if __LINUX_ARM_ARCH__ >= 6 + TEST_GROUP("Signed multiplies") + + TEST_RRR( "smlad r0, r",0, HH1,", r",1, HH2,", r",2, VAL1,"") + TEST_RRR( "smlad r14, r",12,HH2,", r",10,HH1,", r",8, VAL2,"") + TEST_UNSUPPORTED(".word 0xe70f8a1c @ smlad pc, r12, r10, r8") + TEST_RRR( "smladx r0, r",0, HH1,", r",1, HH2,", r",2, VAL1,"") + TEST_RRR( "smladx r14, r",12,HH2,", r",10,HH1,", r",8, VAL2,"") + TEST_UNSUPPORTED(".word 0xe70f8a3c @ smladx pc, r12, r10, r8") + + TEST_RR( "smuad r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "smuad r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe70ffa1c @ smuad pc, r12, r10") + TEST_RR( "smuadx r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "smuadx 
r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe70ffa3c @ smuadx pc, r12, r10") + + TEST_RRR( "smlsd r0, r",0, HH1,", r",1, HH2,", r",2, VAL1,"") + TEST_RRR( "smlsd r14, r",12,HH2,", r",10,HH1,", r",8, VAL2,"") + TEST_UNSUPPORTED(".word 0xe70f8a5c @ smlsd pc, r12, r10, r8") + TEST_RRR( "smlsdx r0, r",0, HH1,", r",1, HH2,", r",2, VAL1,"") + TEST_RRR( "smlsdx r14, r",12,HH2,", r",10,HH1,", r",8, VAL2,"") + TEST_UNSUPPORTED(".word 0xe70f8a7c @ smlsdx pc, r12, r10, r8") + + TEST_RR( "smusd r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "smusd r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe70ffa5c @ smusd pc, r12, r10") + TEST_RR( "smusdx r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "smusdx r14, r",12,HH2,", r",10,HH1,"") + TEST_UNSUPPORTED(".word 0xe70ffa7c @ smusdx pc, r12, r10") + + TEST_RRRR( "smlald r",0, VAL1,", r",1, VAL2, ", r",0, HH1,", r",1, HH2) + TEST_RRRR( "smlald r",11,VAL2,", r",10,VAL1, ", r",9, HH2,", r",8, HH1) + TEST_UNSUPPORTED(".word 0xe74af819 @ smlald pc, r10, r9, r8") + TEST_UNSUPPORTED(".word 0xe74fb819 @ smlald r11, pc, r9, r8") + TEST_UNSUPPORTED(".word 0xe74ab81f @ smlald r11, r10, pc, r8") + TEST_UNSUPPORTED(".word 0xe74abf19 @ smlald r11, r10, r9, pc") + + TEST_RRRR( "smlaldx r",0, VAL1,", r",1, VAL2, ", r",0, HH1,", r",1, HH2) + TEST_RRRR( "smlaldx r",11,VAL2,", r",10,VAL1, ", r",9, HH2,", r",8, HH1) + TEST_UNSUPPORTED(".word 0xe74af839 @ smlaldx pc, r10, r9, r8") + TEST_UNSUPPORTED(".word 0xe74fb839 @ smlaldx r11, pc, r9, r8") + + TEST_RRR( "smmla r0, r",0, VAL1,", r",1, VAL2,", r",2, VAL1,"") + TEST_RRR( "smmla r14, r",12,VAL2,", r",10,VAL1,", r",8, VAL2,"") + TEST_UNSUPPORTED(".word 0xe75f8a1c @ smmla pc, r12, r10, r8") + TEST_RRR( "smmlar r0, r",0, VAL1,", r",1, VAL2,", r",2, VAL1,"") + TEST_RRR( "smmlar r14, r",12,VAL2,", r",10,VAL1,", r",8, VAL2,"") + TEST_UNSUPPORTED(".word 0xe75f8a3c @ smmlar pc, r12, r10, r8") + + TEST_RR( "smmul r0, r",0, VAL1,", r",1, VAL2,"") + TEST_RR( "smmul r14, r",12,VAL2,", r",10,VAL1,"") + TEST_UNSUPPORTED(".word 0xe75ffa1c @ smmul pc, r12, r10") + TEST_RR( "smmulr r0, r",0, VAL1,", r",1, VAL2,"") + TEST_RR( "smmulr r14, r",12,VAL2,", r",10,VAL1,"") + TEST_UNSUPPORTED(".word 0xe75ffa3c @ smmulr pc, r12, r10") + + TEST_RRR( "smmls r0, r",0, VAL1,", r",1, VAL2,", r",2, VAL1,"") + TEST_RRR( "smmls r14, r",12,VAL2,", r",10,VAL1,", r",8, VAL2,"") + TEST_UNSUPPORTED(".word 0xe75f8adc @ smmls pc, r12, r10, r8") + TEST_RRR( "smmlsr r0, r",0, VAL1,", r",1, VAL2,", r",2, VAL1,"") + TEST_RRR( "smmlsr r14, r",12,VAL2,", r",10,VAL1,", r",8, VAL2,"") + TEST_UNSUPPORTED(".word 0xe75f8afc @ smmlsr pc, r12, r10, r8") + TEST_UNSUPPORTED(".word 0xe75e8aff @ smmlsr r14, pc, r10, r8") + TEST_UNSUPPORTED(".word 0xe75e8ffc @ smmlsr r14, r12, pc, r8") + TEST_UNSUPPORTED(".word 0xe75efafc @ smmlsr r14, r12, r10, pc") + + TEST_RR( "usad8 r0, r",0, VAL1,", r",1, VAL2,"") + TEST_RR( "usad8 r14, r",12,VAL2,", r",10,VAL1,"") + TEST_UNSUPPORTED(".word 0xe75ffa1c @ usad8 pc, r12, r10") + TEST_UNSUPPORTED(".word 0xe75efa1f @ usad8 r14, pc, r10") + TEST_UNSUPPORTED(".word 0xe75eff1c @ usad8 r14, r12, pc") + + TEST_RRR( "usada8 r0, r",0, VAL1,", r",1, VAL2,", r",2, VAL3,"") + TEST_RRR( "usada8 r14, r",12,VAL2,", r",10,VAL1,", r",8, VAL3,"") + TEST_UNSUPPORTED(".word 0xe78f8a1c @ usada8 pc, r12, r10, r8") + TEST_UNSUPPORTED(".word 0xe78e8a1f @ usada8 r14, pc, r10, r8") + TEST_UNSUPPORTED(".word 0xe78e8f1c @ usada8 r14, r12, pc, r8") +#endif /* __LINUX_ARM_ARCH__ >= 6 */ + +#if __LINUX_ARM_ARCH__ >= 7 + TEST_GROUP("Bit Field") + + TEST_R( "sbfx 
r0, r",0 , VAL1,", #0, #31") + TEST_R( "sbfxeq r14, r",12, VAL2,", #8, #16") + TEST_R( "sbfx r4, r",10, VAL1,", #16, #15") + TEST_UNSUPPORTED(".word 0xe7aff45c @ sbfx pc, r12, #8, #16") + + TEST_R( "ubfx r0, r",0 , VAL1,", #0, #31") + TEST_R( "ubfxcs r14, r",12, VAL2,", #8, #16") + TEST_R( "ubfx r4, r",10, VAL1,", #16, #15") + TEST_UNSUPPORTED(".word 0xe7eff45c @ ubfx pc, r12, #8, #16") + TEST_UNSUPPORTED(".word 0xe7efc45f @ ubfx r12, pc, #8, #16") + + TEST_R( "bfc r",0, VAL1,", #4, #20") + TEST_R( "bfcvs r",14,VAL2,", #4, #20") + TEST_R( "bfc r",7, VAL1,", #0, #31") + TEST_R( "bfc r",8, VAL2,", #0, #31") + TEST_UNSUPPORTED(".word 0xe7def01f @ bfc pc, #0, #31"); + + TEST_RR( "bfi r",0, VAL1,", r",0 , VAL2,", #0, #31") + TEST_RR( "bfipl r",12,VAL1,", r",14 , VAL2,", #4, #20") + TEST_UNSUPPORTED(".word 0xe7d7f21e @ bfi pc, r14, #4, #20") + + TEST_UNSUPPORTED(".word 0x07f000f0") /* Permanently UNDEFINED */ + TEST_UNSUPPORTED(".word 0x07ffffff") /* Permanently UNDEFINED */ +#endif /* __LINUX_ARM_ARCH__ >= 6 */ + + TEST_GROUP("Branch, branch with link, and block data transfer") + + TEST_P( "stmda r",0, 16*4,", {r0}") + TEST_P( "stmeqda r",4, 16*4,", {r0-r15}") + TEST_P( "stmneda r",8, 16*4,"!, {r8-r15}") + TEST_P( "stmda r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + TEST_P( "stmda r",13,0, "!, {pc}") + + TEST_P( "ldmda r",0, 16*4,", {r0}") + TEST_BF_P("ldmcsda r",4, 15*4,", {r0-r15}") + TEST_BF_P("ldmccda r",7, 15*4,"!, {r8-r15}") + TEST_P( "ldmda r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + TEST_BF_P("ldmda r",14,15*4,"!, {pc}") + + TEST_P( "stmia r",0, 16*4,", {r0}") + TEST_P( "stmmiia r",4, 16*4,", {r0-r15}") + TEST_P( "stmplia r",8, 16*4,"!, {r8-r15}") + TEST_P( "stmia r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + TEST_P( "stmia r",14,0, "!, {pc}") + + TEST_P( "ldmia r",0, 16*4,", {r0}") + TEST_BF_P("ldmvsia r",4, 0, ", {r0-r15}") + TEST_BF_P("ldmvcia r",7, 8*4, "!, {r8-r15}") + TEST_P( "ldmia r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + TEST_BF_P("ldmia r",14,15*4,"!, {pc}") + + TEST_P( "stmdb r",0, 16*4,", {r0}") + TEST_P( "stmhidb r",4, 16*4,", {r0-r15}") + TEST_P( "stmlsdb r",8, 16*4,"!, {r8-r15}") + TEST_P( "stmdb r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + TEST_P( "stmdb r",13,4, "!, {pc}") + + TEST_P( "ldmdb r",0, 16*4,", {r0}") + TEST_BF_P("ldmgedb r",4, 16*4,", {r0-r15}") + TEST_BF_P("ldmltdb r",7, 16*4,"!, {r8-r15}") + TEST_P( "ldmdb r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + TEST_BF_P("ldmdb r",14,16*4,"!, {pc}") + + TEST_P( "stmib r",0, 16*4,", {r0}") + TEST_P( "stmgtib r",4, 16*4,", {r0-r15}") + TEST_P( "stmleib r",8, 16*4,"!, {r8-r15}") + TEST_P( "stmib r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + TEST_P( "stmib r",13,-4, "!, {pc}") + + TEST_P( "ldmib r",0, 16*4,", {r0}") + TEST_BF_P("ldmeqib r",4, -4,", {r0-r15}") + TEST_BF_P("ldmneib r",7, 7*4,"!, {r8-r15}") + TEST_P( "ldmib r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + TEST_BF_P("ldmib r",14,14*4,"!, {pc}") + + TEST_P( "stmdb r",13,16*4,"!, {r3-r12,lr}") + TEST_P( "stmeqdb r",13,16*4,"!, {r3-r12}") + TEST_P( "stmnedb r",2, 16*4,", {r3-r12,lr}") + TEST_P( "stmdb r",13,16*4,"!, {r2-r12,lr}") + TEST_P( "stmdb r",0, 16*4,", {r0-r12}") + TEST_P( "stmdb r",0, 16*4,", {r0-r12,lr}") + + TEST_BF_P("ldmia r",13,5*4, "!, {r3-r12,pc}") + TEST_P( "ldmccia r",13,5*4, "!, {r3-r12}") + TEST_BF_P("ldmcsia r",2, 5*4, "!, {r3-r12,pc}") + TEST_BF_P("ldmia r",13,4*4, "!, {r2-r12,pc}") + TEST_P( "ldmia r",0, 16*4,", {r0-r12}") + TEST_P( "ldmia r",0, 16*4,", {r0-r12,lr}") + +#ifdef CONFIG_THUMB2_KERNEL + TEST_ARM_TO_THUMB_INTERWORK_P("ldmplia 
r",0,15*4,", {pc}") + TEST_ARM_TO_THUMB_INTERWORK_P("ldmmiia r",13,0,", {r0-r15}") +#endif + TEST_BF("b 2f") + TEST_BF("bl 2f") + TEST_BB("b 2b") + TEST_BB("bl 2b") + + TEST_BF("beq 2f") + TEST_BF("bleq 2f") + TEST_BB("bne 2b") + TEST_BB("blne 2b") + + TEST_BF("bgt 2f") + TEST_BF("blgt 2f") + TEST_BB("blt 2b") + TEST_BB("bllt 2b") + + TEST_GROUP("Supervisor Call, and coprocessor instructions") + + /* + * We can't really test these by executing them, so all + * we can do is check that probes are, or are not allowed. + * At the moment none are allowed... + */ +#define TEST_COPROCESSOR(code) TEST_UNSUPPORTED(code) + +#define COPROCESSOR_INSTRUCTIONS_ST_LD(two,cc) \ + TEST_COPROCESSOR("stc"two" 0, cr0, [r13, #4]") \ + TEST_COPROCESSOR("stc"two" 0, cr0, [r13, #-4]") \ + TEST_COPROCESSOR("stc"two" 0, cr0, [r13, #4]!") \ + TEST_COPROCESSOR("stc"two" 0, cr0, [r13, #-4]!") \ + TEST_COPROCESSOR("stc"two" 0, cr0, [r13], #4") \ + TEST_COPROCESSOR("stc"two" 0, cr0, [r13], #-4") \ + TEST_COPROCESSOR("stc"two" 0, cr0, [r13], {1}") \ + TEST_COPROCESSOR("stc"two"l 0, cr0, [r13, #4]") \ + TEST_COPROCESSOR("stc"two"l 0, cr0, [r13, #-4]") \ + TEST_COPROCESSOR("stc"two"l 0, cr0, [r13, #4]!") \ + TEST_COPROCESSOR("stc"two"l 0, cr0, [r13, #-4]!") \ + TEST_COPROCESSOR("stc"two"l 0, cr0, [r13], #4") \ + TEST_COPROCESSOR("stc"two"l 0, cr0, [r13], #-4") \ + TEST_COPROCESSOR("stc"two"l 0, cr0, [r13], {1}") \ + TEST_COPROCESSOR("ldc"two" 0, cr0, [r13, #4]") \ + TEST_COPROCESSOR("ldc"two" 0, cr0, [r13, #-4]") \ + TEST_COPROCESSOR("ldc"two" 0, cr0, [r13, #4]!") \ + TEST_COPROCESSOR("ldc"two" 0, cr0, [r13, #-4]!") \ + TEST_COPROCESSOR("ldc"two" 0, cr0, [r13], #4") \ + TEST_COPROCESSOR("ldc"two" 0, cr0, [r13], #-4") \ + TEST_COPROCESSOR("ldc"two" 0, cr0, [r13], {1}") \ + TEST_COPROCESSOR("ldc"two"l 0, cr0, [r13, #4]") \ + TEST_COPROCESSOR("ldc"two"l 0, cr0, [r13, #-4]") \ + TEST_COPROCESSOR("ldc"two"l 0, cr0, [r13, #4]!") \ + TEST_COPROCESSOR("ldc"two"l 0, cr0, [r13, #-4]!") \ + TEST_COPROCESSOR("ldc"two"l 0, cr0, [r13], #4") \ + TEST_COPROCESSOR("ldc"two"l 0, cr0, [r13], #-4") \ + TEST_COPROCESSOR("ldc"two"l 0, cr0, [r13], {1}") \ + \ + TEST_COPROCESSOR( "stc"two" 0, cr0, [r15, #4]") \ + TEST_COPROCESSOR( "stc"two" 0, cr0, [r15, #-4]") \ + TEST_UNSUPPORTED(".word 0x"cc"daf0001 @ stc"two" 0, cr0, [r15, #4]!") \ + TEST_UNSUPPORTED(".word 0x"cc"d2f0001 @ stc"two" 0, cr0, [r15, #-4]!") \ + TEST_UNSUPPORTED(".word 0x"cc"caf0001 @ stc"two" 0, cr0, [r15], #4") \ + TEST_UNSUPPORTED(".word 0x"cc"c2f0001 @ stc"two" 0, cr0, [r15], #-4") \ + TEST_COPROCESSOR( "stc"two" 0, cr0, [r15], {1}") \ + TEST_COPROCESSOR( "stc"two"l 0, cr0, [r15, #4]") \ + TEST_COPROCESSOR( "stc"two"l 0, cr0, [r15, #-4]") \ + TEST_UNSUPPORTED(".word 0x"cc"def0001 @ stc"two"l 0, cr0, [r15, #4]!") \ + TEST_UNSUPPORTED(".word 0x"cc"d6f0001 @ stc"two"l 0, cr0, [r15, #-4]!") \ + TEST_UNSUPPORTED(".word 0x"cc"cef0001 @ stc"two"l 0, cr0, [r15], #4") \ + TEST_UNSUPPORTED(".word 0x"cc"c6f0001 @ stc"two"l 0, cr0, [r15], #-4") \ + TEST_COPROCESSOR( "stc"two"l 0, cr0, [r15], {1}") \ + TEST_COPROCESSOR( "ldc"two" 0, cr0, [r15, #4]") \ + TEST_COPROCESSOR( "ldc"two" 0, cr0, [r15, #-4]") \ + TEST_UNSUPPORTED(".word 0x"cc"dbf0001 @ ldc"two" 0, cr0, [r15, #4]!") \ + TEST_UNSUPPORTED(".word 0x"cc"d3f0001 @ ldc"two" 0, cr0, [r15, #-4]!") \ + TEST_UNSUPPORTED(".word 0x"cc"cbf0001 @ ldc"two" 0, cr0, [r15], #4") \ + TEST_UNSUPPORTED(".word 0x"cc"c3f0001 @ ldc"two" 0, cr0, [r15], #-4") \ + TEST_COPROCESSOR( "ldc"two" 0, cr0, [r15], {1}") \ + TEST_COPROCESSOR( "ldc"two"l 0, cr0, [r15, 
#4]") \ + TEST_COPROCESSOR( "ldc"two"l 0, cr0, [r15, #-4]") \ + TEST_UNSUPPORTED(".word 0x"cc"dff0001 @ ldc"two"l 0, cr0, [r15, #4]!") \ + TEST_UNSUPPORTED(".word 0x"cc"d7f0001 @ ldc"two"l 0, cr0, [r15, #-4]!") \ + TEST_UNSUPPORTED(".word 0x"cc"cff0001 @ ldc"two"l 0, cr0, [r15], #4") \ + TEST_UNSUPPORTED(".word 0x"cc"c7f0001 @ ldc"two"l 0, cr0, [r15], #-4") \ + TEST_COPROCESSOR( "ldc"two"l 0, cr0, [r15], {1}") + +#define COPROCESSOR_INSTRUCTIONS_MC_MR(two,cc) \ + \ + TEST_COPROCESSOR( "mcrr"two" 0, 15, r0, r14, cr0") \ + TEST_COPROCESSOR( "mcrr"two" 15, 0, r14, r0, cr15") \ + TEST_UNSUPPORTED(".word 0x"cc"c4f00f0 @ mcrr"two" 0, 15, r0, r15, cr0") \ + TEST_UNSUPPORTED(".word 0x"cc"c40ff0f @ mcrr"two" 15, 0, r15, r0, cr15") \ + TEST_COPROCESSOR( "mrrc"two" 0, 15, r0, r14, cr0") \ + TEST_COPROCESSOR( "mrrc"two" 15, 0, r14, r0, cr15") \ + TEST_UNSUPPORTED(".word 0x"cc"c5f00f0 @ mrrc"two" 0, 15, r0, r15, cr0") \ + TEST_UNSUPPORTED(".word 0x"cc"c50ff0f @ mrrc"two" 15, 0, r15, r0, cr15") \ + TEST_COPROCESSOR( "cdp"two" 15, 15, cr15, cr15, cr15, 7") \ + TEST_COPROCESSOR( "cdp"two" 0, 0, cr0, cr0, cr0, 0") \ + TEST_COPROCESSOR( "mcr"two" 15, 7, r15, cr15, cr15, 7") \ + TEST_COPROCESSOR( "mcr"two" 0, 0, r0, cr0, cr0, 0") \ + TEST_COPROCESSOR( "mrc"two" 15, 7, r15, cr15, cr15, 7") \ + TEST_COPROCESSOR( "mrc"two" 0, 0, r0, cr0, cr0, 0") + + COPROCESSOR_INSTRUCTIONS_ST_LD("","e") + COPROCESSOR_INSTRUCTIONS_MC_MR("","e") + TEST_UNSUPPORTED("svc 0") + TEST_UNSUPPORTED("svc 0xffffff") + + TEST_UNSUPPORTED("svc 0") + + TEST_GROUP("Unconditional instruction") + +#if __LINUX_ARM_ARCH__ >= 6 + TEST_UNSUPPORTED("srsda sp, 0x13") + TEST_UNSUPPORTED("srsdb sp, 0x13") + TEST_UNSUPPORTED("srsia sp, 0x13") + TEST_UNSUPPORTED("srsib sp, 0x13") + TEST_UNSUPPORTED("srsda sp!, 0x13") + TEST_UNSUPPORTED("srsdb sp!, 0x13") + TEST_UNSUPPORTED("srsia sp!, 0x13") + TEST_UNSUPPORTED("srsib sp!, 0x13") + + TEST_UNSUPPORTED("rfeda sp") + TEST_UNSUPPORTED("rfedb sp") + TEST_UNSUPPORTED("rfeia sp") + TEST_UNSUPPORTED("rfeib sp") + TEST_UNSUPPORTED("rfeda sp!") + TEST_UNSUPPORTED("rfedb sp!") + TEST_UNSUPPORTED("rfeia sp!") + TEST_UNSUPPORTED("rfeib sp!") + TEST_UNSUPPORTED(".word 0xf81d0a00 @ rfeda pc") + TEST_UNSUPPORTED(".word 0xf91d0a00 @ rfedb pc") + TEST_UNSUPPORTED(".word 0xf89d0a00 @ rfeia pc") + TEST_UNSUPPORTED(".word 0xf99d0a00 @ rfeib pc") + TEST_UNSUPPORTED(".word 0xf83d0a00 @ rfeda pc!") + TEST_UNSUPPORTED(".word 0xf93d0a00 @ rfedb pc!") + TEST_UNSUPPORTED(".word 0xf8bd0a00 @ rfeia pc!") + TEST_UNSUPPORTED(".word 0xf9bd0a00 @ rfeib pc!") +#endif /* __LINUX_ARM_ARCH__ >= 6 */ + +#if __LINUX_ARM_ARCH__ >= 6 + TEST_X( "blx __dummy_thumb_subroutine_even", + ".thumb \n\t" + ".space 4 \n\t" + ".type __dummy_thumb_subroutine_even, %%function \n\t" + "__dummy_thumb_subroutine_even: \n\t" + "mov r0, pc \n\t" + "bx lr \n\t" + ".arm \n\t" + ) + TEST( "blx __dummy_thumb_subroutine_even") + + TEST_X( "blx __dummy_thumb_subroutine_odd", + ".thumb \n\t" + ".space 2 \n\t" + ".type __dummy_thumb_subroutine_odd, %%function \n\t" + "__dummy_thumb_subroutine_odd: \n\t" + "mov r0, pc \n\t" + "bx lr \n\t" + ".arm \n\t" + ) + TEST( "blx __dummy_thumb_subroutine_odd") +#endif /* __LINUX_ARM_ARCH__ >= 6 */ + + COPROCESSOR_INSTRUCTIONS_ST_LD("2","f") +#if __LINUX_ARM_ARCH__ >= 6 + COPROCESSOR_INSTRUCTIONS_MC_MR("2","f") +#endif + + TEST_GROUP("Miscellaneous instructions, memory hints, and Advanced SIMD instructions") + +#if __LINUX_ARM_ARCH__ >= 6 + TEST_UNSUPPORTED("cps 0x13") + TEST_UNSUPPORTED("cpsie i") + TEST_UNSUPPORTED("cpsid i") + 
TEST_UNSUPPORTED("cpsie i,0x13") + TEST_UNSUPPORTED("cpsid i,0x13") + TEST_UNSUPPORTED("setend le") + TEST_UNSUPPORTED("setend be") +#endif + +#if __LINUX_ARM_ARCH__ >= 7 + TEST_P("pli [r",0,0b,", #16]") + TEST( "pli [pc, #0]") + TEST_RR("pli [r",12,0b,", r",0, 16,"]") + TEST_RR("pli [r",0, 0b,", -r",12,16,", lsl #4]") +#endif + +#if __LINUX_ARM_ARCH__ >= 5 + TEST_P("pld [r",0,32,", #-16]") + TEST( "pld [pc, #0]") + TEST_PR("pld [r",7, 24, ", r",0, 16,"]") + TEST_PR("pld [r",8, 24, ", -r",12,16,", lsl #4]") +#endif + +#if __LINUX_ARM_ARCH__ >= 7 + TEST_SUPPORTED( ".word 0xf590f000 @ pldw [r0, #0]") + TEST_SUPPORTED( ".word 0xf797f000 @ pldw [r7, r0]") + TEST_SUPPORTED( ".word 0xf798f18c @ pldw [r8, r12, lsl #3]"); +#endif + +#if __LINUX_ARM_ARCH__ >= 7 + TEST_UNSUPPORTED("clrex") + TEST_UNSUPPORTED("dsb") + TEST_UNSUPPORTED("dmb") + TEST_UNSUPPORTED("isb") +#endif + + verbose("\n"); +} + diff --git a/arch/arm/kernel/kprobes-test-thumb.c b/arch/arm/kernel/kprobes-test-thumb.c new file mode 100644 index 000000000000..5e726c31c45a --- /dev/null +++ b/arch/arm/kernel/kprobes-test-thumb.c @@ -0,0 +1,1187 @@ +/* + * arch/arm/kernel/kprobes-test-thumb.c + * + * Copyright (C) 2011 Jon Medhurst . + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include +#include + +#include "kprobes-test.h" + + +#define TEST_ISA "16" + +#define DONT_TEST_IN_ITBLOCK(tests) \ + kprobe_test_flags |= TEST_FLAG_NO_ITBLOCK; \ + tests \ + kprobe_test_flags &= ~TEST_FLAG_NO_ITBLOCK; + +#define CONDITION_INSTRUCTIONS(cc_pos, tests) \ + kprobe_test_cc_position = cc_pos; \ + DONT_TEST_IN_ITBLOCK(tests) \ + kprobe_test_cc_position = 0; + +#define TEST_ITBLOCK(code) \ + kprobe_test_flags |= TEST_FLAG_FULL_ITBLOCK; \ + TESTCASE_START(code) \ + TEST_ARG_END("") \ + "50: nop \n\t" \ + "1: "code" \n\t" \ + " mov r1, #0x11 \n\t" \ + " mov r2, #0x22 \n\t" \ + " mov r3, #0x33 \n\t" \ + "2: nop \n\t" \ + TESTCASE_END \ + kprobe_test_flags &= ~TEST_FLAG_FULL_ITBLOCK; + +#define TEST_THUMB_TO_ARM_INTERWORK_P(code1, reg, val, code2) \ + TESTCASE_START(code1 #reg code2) \ + TEST_ARG_PTR(reg, val) \ + TEST_ARG_REG(14, 99f+1) \ + TEST_ARG_MEM(15, 3f) \ + TEST_ARG_END("") \ + " nop \n\t" /* To align 1f */ \ + "50: nop \n\t" \ + "1: "code1 #reg code2" \n\t" \ + " bx lr \n\t" \ + ".arm \n\t" \ + "3: adr lr, 2f+1 \n\t" \ + " bx lr \n\t" \ + ".thumb \n\t" \ + "2: nop \n\t" \ + TESTCASE_END + + +void kprobe_thumb16_test_cases(void) +{ + kprobe_test_flags = TEST_FLAG_NARROW_INSTR; + + TEST_GROUP("Shift (immediate), add, subtract, move, and compare") + + TEST_R( "lsls r7, r",0,VAL1,", #5") + TEST_R( "lsls r0, r",7,VAL2,", #11") + TEST_R( "lsrs r7, r",0,VAL1,", #5") + TEST_R( "lsrs r0, r",7,VAL2,", #11") + TEST_R( "asrs r7, r",0,VAL1,", #5") + TEST_R( "asrs r0, r",7,VAL2,", #11") + TEST_RR( "adds r2, r",0,VAL1,", r",7,VAL2,"") + TEST_RR( "adds r5, r",7,VAL2,", r",0,VAL2,"") + TEST_RR( "subs r2, r",0,VAL1,", r",7,VAL2,"") + TEST_RR( "subs r5, r",7,VAL2,", r",0,VAL2,"") + TEST_R( "adds r7, r",0,VAL1,", #5") + TEST_R( "adds r0, r",7,VAL2,", #2") + TEST_R( "subs r7, r",0,VAL1,", #5") + TEST_R( "subs r0, r",7,VAL2,", #2") + TEST( "movs.n r0, #0x5f") + TEST( "movs.n r7, #0xa0") + TEST_R( "cmp.n r",0,0x5e, ", #0x5f") + TEST_R( "cmp.n r",5,0x15f,", #0x5f") + TEST_R( "cmp.n r",7,0xa0, ", #0xa0") + TEST_R( "adds.n r",0,VAL1,", #0x5f") + TEST_R( "adds.n r",7,VAL2,", #0xa0") + TEST_R( "subs.n r",0,VAL1,", 
#0x5f") + TEST_R( "subs.n r",7,VAL2,", #0xa0") + + TEST_GROUP("16-bit Thumb data-processing instructions") + +#define DATA_PROCESSING16(op,val) \ + TEST_RR( op" r",0,VAL1,", r",7,val,"") \ + TEST_RR( op" r",7,VAL2,", r",0,val,"") + + DATA_PROCESSING16("ands",0xf00f00ff) + DATA_PROCESSING16("eors",0xf00f00ff) + DATA_PROCESSING16("lsls",11) + DATA_PROCESSING16("lsrs",11) + DATA_PROCESSING16("asrs",11) + DATA_PROCESSING16("adcs",VAL2) + DATA_PROCESSING16("sbcs",VAL2) + DATA_PROCESSING16("rors",11) + DATA_PROCESSING16("tst",0xf00f00ff) + TEST_R("rsbs r",0,VAL1,", #0") + TEST_R("rsbs r",7,VAL2,", #0") + DATA_PROCESSING16("cmp",0xf00f00ff) + DATA_PROCESSING16("cmn",0xf00f00ff) + DATA_PROCESSING16("orrs",0xf00f00ff) + DATA_PROCESSING16("muls",VAL2) + DATA_PROCESSING16("bics",0xf00f00ff) + DATA_PROCESSING16("mvns",VAL2) + + TEST_GROUP("Special data instructions and branch and exchange") + + TEST_RR( "add r",0, VAL1,", r",7,VAL2,"") + TEST_RR( "add r",3, VAL2,", r",8,VAL3,"") + TEST_RR( "add r",8, VAL3,", r",0,VAL1,"") + TEST_R( "add sp" ", r",8,-8, "") + TEST_R( "add r",14,VAL1,", pc") + TEST_BF_R("add pc" ", r",0,2f-1f-8,"") + TEST_UNSUPPORTED(".short 0x44ff @ add pc, pc") + + TEST_RR( "cmp r",3,VAL1,", r",8,VAL2,"") + TEST_RR( "cmp r",8,VAL2,", r",0,VAL1,"") + TEST_R( "cmp sp" ", r",8,-8, "") + + TEST_R( "mov r0, r",7,VAL2,"") + TEST_R( "mov r3, r",8,VAL3,"") + TEST_R( "mov r8, r",0,VAL1,"") + TEST_P( "mov sp, r",8,-8, "") + TEST( "mov lr, pc") + TEST_BF_R("mov pc, r",0,2f, "") + + TEST_BF_R("bx r",0, 2f+1,"") + TEST_BF_R("bx r",14,2f+1,"") + TESTCASE_START("bx pc") + TEST_ARG_REG(14, 99f+1) + TEST_ARG_END("") + " nop \n\t" /* To align the bx pc*/ + "50: nop \n\t" + "1: bx pc \n\t" + " bx lr \n\t" + ".arm \n\t" + " adr lr, 2f+1 \n\t" + " bx lr \n\t" + ".thumb \n\t" + "2: nop \n\t" + TESTCASE_END + + TEST_BF_R("blx r",0, 2f+1,"") + TEST_BB_R("blx r",14,2f+1,"") + TEST_UNSUPPORTED(".short 0x47f8 @ blx pc") + + TEST_GROUP("Load from Literal Pool") + + TEST_X( "ldr r0, 3f", + ".align \n\t" + "3: .word "__stringify(VAL1)) + TEST_X( "ldr r7, 3f", + ".space 128 \n\t" + ".align \n\t" + "3: .word "__stringify(VAL2)) + + TEST_GROUP("16-bit Thumb Load/store instructions") + + TEST_RPR("str r",0, VAL1,", [r",1, 24,", r",2, 48,"]") + TEST_RPR("str r",7, VAL2,", [r",6, 24,", r",5, 48,"]") + TEST_RPR("strh r",0, VAL1,", [r",1, 24,", r",2, 48,"]") + TEST_RPR("strh r",7, VAL2,", [r",6, 24,", r",5, 48,"]") + TEST_RPR("strb r",0, VAL1,", [r",1, 24,", r",2, 48,"]") + TEST_RPR("strb r",7, VAL2,", [r",6, 24,", r",5, 48,"]") + TEST_PR( "ldrsb r0, [r",1, 24,", r",2, 48,"]") + TEST_PR( "ldrsb r7, [r",6, 24,", r",5, 50,"]") + TEST_PR( "ldr r0, [r",1, 24,", r",2, 48,"]") + TEST_PR( "ldr r7, [r",6, 24,", r",5, 48,"]") + TEST_PR( "ldrh r0, [r",1, 24,", r",2, 48,"]") + TEST_PR( "ldrh r7, [r",6, 24,", r",5, 50,"]") + TEST_PR( "ldrb r0, [r",1, 24,", r",2, 48,"]") + TEST_PR( "ldrb r7, [r",6, 24,", r",5, 50,"]") + TEST_PR( "ldrsh r0, [r",1, 24,", r",2, 48,"]") + TEST_PR( "ldrsh r7, [r",6, 24,", r",5, 50,"]") + + TEST_RP("str r",0, VAL1,", [r",1, 24,", #120]") + TEST_RP("str r",7, VAL2,", [r",6, 24,", #120]") + TEST_P( "ldr r0, [r",1, 24,", #120]") + TEST_P( "ldr r7, [r",6, 24,", #120]") + TEST_RP("strb r",0, VAL1,", [r",1, 24,", #30]") + TEST_RP("strb r",7, VAL2,", [r",6, 24,", #30]") + TEST_P( "ldrb r0, [r",1, 24,", #30]") + TEST_P( "ldrb r7, [r",6, 24,", #30]") + TEST_RP("strh r",0, VAL1,", [r",1, 24,", #60]") + TEST_RP("strh r",7, VAL2,", [r",6, 24,", #60]") + TEST_P( "ldrh r0, [r",1, 24,", #60]") + TEST_P( "ldrh r7, [r",6, 
24,", #60]") + + TEST_R( "str r",0, VAL1,", [sp, #0]") + TEST_R( "str r",7, VAL2,", [sp, #160]") + TEST( "ldr r0, [sp, #0]") + TEST( "ldr r7, [sp, #160]") + + TEST_RP("str r",0, VAL1,", [r",0, 24,"]") + TEST_P( "ldr r0, [r",0, 24,"]") + + TEST_GROUP("Generate PC-/SP-relative address") + + TEST("add r0, pc, #4") + TEST("add r7, pc, #1020") + TEST("add r0, sp, #4") + TEST("add r7, sp, #1020") + + TEST_GROUP("Miscellaneous 16-bit instructions") + + TEST_UNSUPPORTED( "cpsie i") + TEST_UNSUPPORTED( "cpsid i") + TEST_UNSUPPORTED( "setend le") + TEST_UNSUPPORTED( "setend be") + + TEST("add sp, #"__stringify(TEST_MEMORY_SIZE)) /* Assumes TEST_MEMORY_SIZE < 0x400 */ + TEST("sub sp, #0x7f*4") + +DONT_TEST_IN_ITBLOCK( + TEST_BF_R( "cbnz r",0,0, ", 2f") + TEST_BF_R( "cbz r",2,-1,", 2f") + TEST_BF_RX( "cbnz r",4,1, ", 2f",0x20) + TEST_BF_RX( "cbz r",7,0, ", 2f",0x40) +) + TEST_R("sxth r0, r",7, HH1,"") + TEST_R("sxth r7, r",0, HH2,"") + TEST_R("sxtb r0, r",7, HH1,"") + TEST_R("sxtb r7, r",0, HH2,"") + TEST_R("uxth r0, r",7, HH1,"") + TEST_R("uxth r7, r",0, HH2,"") + TEST_R("uxtb r0, r",7, HH1,"") + TEST_R("uxtb r7, r",0, HH2,"") + TEST_R("rev r0, r",7, VAL1,"") + TEST_R("rev r7, r",0, VAL2,"") + TEST_R("rev16 r0, r",7, VAL1,"") + TEST_R("rev16 r7, r",0, VAL2,"") + TEST_UNSUPPORTED(".short 0xba80") + TEST_UNSUPPORTED(".short 0xbabf") + TEST_R("revsh r0, r",7, VAL1,"") + TEST_R("revsh r7, r",0, VAL2,"") + +#define TEST_POPPC(code, offset) \ + TESTCASE_START(code) \ + TEST_ARG_PTR(13, offset) \ + TEST_ARG_END("") \ + TEST_BRANCH_F(code,0) \ + TESTCASE_END + + TEST("push {r0}") + TEST("push {r7}") + TEST("push {r14}") + TEST("push {r0-r7,r14}") + TEST("push {r0,r2,r4,r6,r14}") + TEST("push {r1,r3,r5,r7}") + TEST("pop {r0}") + TEST("pop {r7}") + TEST("pop {r0,r2,r4,r6}") + TEST_POPPC("pop {pc}",15*4) + TEST_POPPC("pop {r0-r7,pc}",7*4) + TEST_POPPC("pop {r1,r3,r5,r7,pc}",11*4) + TEST_THUMB_TO_ARM_INTERWORK_P("pop {pc} @ ",13,15*4,"") + TEST_THUMB_TO_ARM_INTERWORK_P("pop {r0-r7,pc} @ ",13,7*4,"") + + TEST_UNSUPPORTED("bkpt.n 0") + TEST_UNSUPPORTED("bkpt.n 255") + + TEST_SUPPORTED("yield") + TEST("sev") + TEST("nop") + TEST("wfi") + TEST_SUPPORTED("wfe") + TEST_UNSUPPORTED(".short 0xbf50") /* Unassigned hints */ + TEST_UNSUPPORTED(".short 0xbff0") /* Unassigned hints */ + +#define TEST_IT(code, code2) \ + TESTCASE_START(code) \ + TEST_ARG_END("") \ + "50: nop \n\t" \ + "1: "code" \n\t" \ + " "code2" \n\t" \ + "2: nop \n\t" \ + TESTCASE_END + +DONT_TEST_IN_ITBLOCK( + TEST_IT("it eq","moveq r0,#0") + TEST_IT("it vc","movvc r0,#0") + TEST_IT("it le","movle r0,#0") + TEST_IT("ite eq","moveq r0,#0\n\t movne r1,#1") + TEST_IT("itet vc","movvc r0,#0\n\t movvs r1,#1\n\t movvc r2,#2") + TEST_IT("itete le","movle r0,#0\n\t movgt r1,#1\n\t movle r2,#2\n\t movgt r3,#3") + TEST_IT("itttt le","movle r0,#0\n\t movle r1,#1\n\t movle r2,#2\n\t movle r3,#3") + TEST_IT("iteee le","movle r0,#0\n\t movgt r1,#1\n\t movgt r2,#2\n\t movgt r3,#3") +) + + TEST_GROUP("Load and store multiple") + + TEST_P("ldmia r",4, 16*4,"!, {r0,r7}") + TEST_P("ldmia r",7, 16*4,"!, {r0-r6}") + TEST_P("stmia r",4, 16*4,"!, {r0,r7}") + TEST_P("stmia r",0, 16*4,"!, {r0-r7}") + + TEST_GROUP("Conditional branch and Supervisor Call instructions") + +CONDITION_INSTRUCTIONS(8, + TEST_BF("beq 2f") + TEST_BB("bne 2b") + TEST_BF("bgt 2f") + TEST_BB("blt 2b") +) + TEST_UNSUPPORTED(".short 0xde00") + TEST_UNSUPPORTED(".short 0xdeff") + TEST_UNSUPPORTED("svc #0x00") + TEST_UNSUPPORTED("svc #0xff") + + TEST_GROUP("Unconditional branch") + + TEST_BF( "b 2f") + 
TEST_BB( "b 2b") + TEST_BF_X("b 2f", 0x400) + TEST_BB_X("b 2b", 0x400) + + TEST_GROUP("Testing instructions in IT blocks") + + TEST_ITBLOCK("subs.n r0, r0") + + verbose("\n"); +} + + +void kprobe_thumb32_test_cases(void) +{ + kprobe_test_flags = 0; + + TEST_GROUP("Load/store multiple") + + TEST_UNSUPPORTED("rfedb sp") + TEST_UNSUPPORTED("rfeia sp") + TEST_UNSUPPORTED("rfedb sp!") + TEST_UNSUPPORTED("rfeia sp!") + + TEST_P( "stmia r",0, 16*4,", {r0,r8}") + TEST_P( "stmia r",4, 16*4,", {r0-r12,r14}") + TEST_P( "stmia r",7, 16*4,"!, {r8-r12,r14}") + TEST_P( "stmia r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + + TEST_P( "ldmia r",0, 16*4,", {r0,r8}") + TEST_P( "ldmia r",4, 0, ", {r0-r12,r14}") + TEST_BF_P("ldmia r",5, 8*4, "!, {r6-r12,r15}") + TEST_P( "ldmia r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + TEST_BF_P("ldmia r",14,14*4,"!, {r4,pc}") + + TEST_P( "stmdb r",0, 16*4,", {r0,r8}") + TEST_P( "stmdb r",4, 16*4,", {r0-r12,r14}") + TEST_P( "stmdb r",5, 16*4,"!, {r8-r12,r14}") + TEST_P( "stmdb r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + + TEST_P( "ldmdb r",0, 16*4,", {r0,r8}") + TEST_P( "ldmdb r",4, 16*4,", {r0-r12,r14}") + TEST_BF_P("ldmdb r",5, 16*4,"!, {r6-r12,r15}") + TEST_P( "ldmdb r",12,16*4,"!, {r1,r3,r5,r7,r8-r11,r14}") + TEST_BF_P("ldmdb r",14,16*4,"!, {r4,pc}") + + TEST_P( "stmdb r",13,16*4,"!, {r3-r12,lr}") + TEST_P( "stmdb r",13,16*4,"!, {r3-r12}") + TEST_P( "stmdb r",2, 16*4,", {r3-r12,lr}") + TEST_P( "stmdb r",13,16*4,"!, {r2-r12,lr}") + TEST_P( "stmdb r",0, 16*4,", {r0-r12}") + TEST_P( "stmdb r",0, 16*4,", {r0-r12,lr}") + + TEST_BF_P("ldmia r",13,5*4, "!, {r3-r12,pc}") + TEST_P( "ldmia r",13,5*4, "!, {r3-r12}") + TEST_BF_P("ldmia r",2, 5*4, "!, {r3-r12,pc}") + TEST_BF_P("ldmia r",13,4*4, "!, {r2-r12,pc}") + TEST_P( "ldmia r",0, 16*4,", {r0-r12}") + TEST_P( "ldmia r",0, 16*4,", {r0-r12,lr}") + + TEST_THUMB_TO_ARM_INTERWORK_P("ldmia r",0,14*4,", {r12,pc}") + TEST_THUMB_TO_ARM_INTERWORK_P("ldmia r",13,2*4,", {r0-r12,pc}") + + TEST_UNSUPPORTED(".short 0xe88f,0x0101 @ stmia pc, {r0,r8}") + TEST_UNSUPPORTED(".short 0xe92f,0x5f00 @ stmdb pc!, {r8-r12,r14}") + TEST_UNSUPPORTED(".short 0xe8bd,0xc000 @ ldmia r13!, {r14,pc}") + TEST_UNSUPPORTED(".short 0xe93e,0xc000 @ ldmdb r14!, {r14,pc}") + TEST_UNSUPPORTED(".short 0xe8a7,0x3f00 @ stmia r7!, {r8-r12,sp}") + TEST_UNSUPPORTED(".short 0xe8a7,0x9f00 @ stmia r7!, {r8-r12,pc}") + TEST_UNSUPPORTED(".short 0xe93e,0x2010 @ ldmdb r14!, {r4,sp}") + + TEST_GROUP("Load/store double or exclusive, table branch") + + TEST_P( "ldrd r0, r1, [r",1, 24,", #-16]") + TEST( "ldrd r12, r14, [sp, #16]") + TEST_P( "ldrd r1, r0, [r",7, 24,", #-16]!") + TEST( "ldrd r14, r12, [sp, #16]!") + TEST_P( "ldrd r1, r0, [r",7, 24,"], #16") + TEST( "ldrd r7, r8, [sp], #-16") + + TEST_X( "ldrd r12, r14, 3f", + ".align 3 \n\t" + "3: .word "__stringify(VAL1)" \n\t" + " .word "__stringify(VAL2)) + + TEST_UNSUPPORTED(".short 0xe9ff,0xec04 @ ldrd r14, r12, [pc, #16]!") + TEST_UNSUPPORTED(".short 0xe8ff,0xec04 @ ldrd r14, r12, [pc], #16") + TEST_UNSUPPORTED(".short 0xe9d4,0xd800 @ ldrd sp, r8, [r4]") + TEST_UNSUPPORTED(".short 0xe9d4,0xf800 @ ldrd pc, r8, [r4]") + TEST_UNSUPPORTED(".short 0xe9d4,0x7d00 @ ldrd r7, sp, [r4]") + TEST_UNSUPPORTED(".short 0xe9d4,0x7f00 @ ldrd r7, pc, [r4]") + + TEST_RRP("strd r",0, VAL1,", r",1, VAL2,", [r",1, 24,", #-16]") + TEST_RR( "strd r",12,VAL2,", r",14,VAL1,", [sp, #16]") + TEST_RRP("strd r",1, VAL1,", r",0, VAL2,", [r",7, 24,", #-16]!") + TEST_RR( "strd r",14,VAL2,", r",12,VAL1,", [sp, #16]!") + TEST_RRP("strd r",1, VAL1,", r",0, VAL2,", [r",7, 
24,"], #16") + TEST_RR( "strd r",7, VAL2,", r",8, VAL1,", [sp], #-16") + TEST_UNSUPPORTED(".short 0xe9ef,0xec04 @ strd r14, r12, [pc, #16]!") + TEST_UNSUPPORTED(".short 0xe8ef,0xec04 @ strd r14, r12, [pc], #16") + + TEST_RX("tbb [pc, r",0, (9f-(1f+4)),"]", + "9: \n\t" + ".byte (2f-1b-4)>>1 \n\t" + ".byte (3f-1b-4)>>1 \n\t" + "3: mvn r0, r0 \n\t" + "2: nop \n\t") + + TEST_RX("tbb [pc, r",4, (9f-(1f+4)+1),"]", + "9: \n\t" + ".byte (2f-1b-4)>>1 \n\t" + ".byte (3f-1b-4)>>1 \n\t" + "3: mvn r0, r0 \n\t" + "2: nop \n\t") + + TEST_RRX("tbb [r",1,9f,", r",2,0,"]", + "9: \n\t" + ".byte (2f-1b-4)>>1 \n\t" + ".byte (3f-1b-4)>>1 \n\t" + "3: mvn r0, r0 \n\t" + "2: nop \n\t") + + TEST_RX("tbh [pc, r",7, (9f-(1f+4))>>1,"]", + "9: \n\t" + ".short (2f-1b-4)>>1 \n\t" + ".short (3f-1b-4)>>1 \n\t" + "3: mvn r0, r0 \n\t" + "2: nop \n\t") + + TEST_RX("tbh [pc, r",12, ((9f-(1f+4))>>1)+1,"]", + "9: \n\t" + ".short (2f-1b-4)>>1 \n\t" + ".short (3f-1b-4)>>1 \n\t" + "3: mvn r0, r0 \n\t" + "2: nop \n\t") + + TEST_RRX("tbh [r",1,9f, ", r",14,1,"]", + "9: \n\t" + ".short (2f-1b-4)>>1 \n\t" + ".short (3f-1b-4)>>1 \n\t" + "3: mvn r0, r0 \n\t" + "2: nop \n\t") + + TEST_UNSUPPORTED(".short 0xe8d1,0xf01f @ tbh [r1, pc]") + TEST_UNSUPPORTED(".short 0xe8d1,0xf01d @ tbh [r1, sp]") + TEST_UNSUPPORTED(".short 0xe8dd,0xf012 @ tbh [sp, r2]") + + TEST_UNSUPPORTED("strexb r0, r1, [r2]") + TEST_UNSUPPORTED("strexh r0, r1, [r2]") + TEST_UNSUPPORTED("strexd r0, r1, [r2]") + TEST_UNSUPPORTED("ldrexb r0, [r1]") + TEST_UNSUPPORTED("ldrexh r0, [r1]") + TEST_UNSUPPORTED("ldrexd r0, [r1]") + + TEST_GROUP("Data-processing (shifted register) and (modified immediate)") + +#define _DATA_PROCESSING32_DNM(op,s,val) \ + TEST_RR(op s".w r0, r",1, VAL1,", r",2, val, "") \ + TEST_RR(op s" r1, r",1, VAL1,", r",2, val, ", lsl #3") \ + TEST_RR(op s" r2, r",3, VAL1,", r",2, val, ", lsr #4") \ + TEST_RR(op s" r3, r",3, VAL1,", r",2, val, ", asr #5") \ + TEST_RR(op s" r4, r",5, VAL1,", r",2, N(val),", asr #6") \ + TEST_RR(op s" r5, r",5, VAL1,", r",2, val, ", ror #7") \ + TEST_RR(op s" r8, r",9, VAL1,", r",10,val, ", rrx") \ + TEST_R( op s" r0, r",11,VAL1,", #0x00010001") \ + TEST_R( op s" r11, r",0, VAL1,", #0xf5000000") \ + TEST_R( op s" r7, r",8, VAL2,", #0x000af000") + +#define DATA_PROCESSING32_DNM(op,val) \ + _DATA_PROCESSING32_DNM(op,"",val) \ + _DATA_PROCESSING32_DNM(op,"s",val) + +#define DATA_PROCESSING32_NM(op,val) \ + TEST_RR(op".w r",1, VAL1,", r",2, val, "") \ + TEST_RR(op" r",1, VAL1,", r",2, val, ", lsl #3") \ + TEST_RR(op" r",3, VAL1,", r",2, val, ", lsr #4") \ + TEST_RR(op" r",3, VAL1,", r",2, val, ", asr #5") \ + TEST_RR(op" r",5, VAL1,", r",2, N(val),", asr #6") \ + TEST_RR(op" r",5, VAL1,", r",2, val, ", ror #7") \ + TEST_RR(op" r",9, VAL1,", r",10,val, ", rrx") \ + TEST_R( op" r",11,VAL1,", #0x00010001") \ + TEST_R( op" r",0, VAL1,", #0xf5000000") \ + TEST_R( op" r",8, VAL2,", #0x000af000") + +#define _DATA_PROCESSING32_DM(op,s,val) \ + TEST_R( op s".w r0, r",14, val, "") \ + TEST_R( op s" r1, r",12, val, ", lsl #3") \ + TEST_R( op s" r2, r",11, val, ", lsr #4") \ + TEST_R( op s" r3, r",10, val, ", asr #5") \ + TEST_R( op s" r4, r",9, N(val),", asr #6") \ + TEST_R( op s" r5, r",8, val, ", ror #7") \ + TEST_R( op s" r8, r",7,val, ", rrx") \ + TEST( op s" r0, #0x00010001") \ + TEST( op s" r11, #0xf5000000") \ + TEST( op s" r7, #0x000af000") \ + TEST( op s" r4, #0x00005a00") + +#define DATA_PROCESSING32_DM(op,val) \ + _DATA_PROCESSING32_DM(op,"",val) \ + _DATA_PROCESSING32_DM(op,"s",val) + + DATA_PROCESSING32_DNM("and",0xf00f00ff) + 
DATA_PROCESSING32_NM("tst",0xf00f00ff) + DATA_PROCESSING32_DNM("bic",0xf00f00ff) + DATA_PROCESSING32_DNM("orr",0xf00f00ff) + DATA_PROCESSING32_DM("mov",VAL2) + DATA_PROCESSING32_DNM("orn",0xf00f00ff) + DATA_PROCESSING32_DM("mvn",VAL2) + DATA_PROCESSING32_DNM("eor",0xf00f00ff) + DATA_PROCESSING32_NM("teq",0xf00f00ff) + DATA_PROCESSING32_DNM("add",VAL2) + DATA_PROCESSING32_NM("cmn",VAL2) + DATA_PROCESSING32_DNM("adc",VAL2) + DATA_PROCESSING32_DNM("sbc",VAL2) + DATA_PROCESSING32_DNM("sub",VAL2) + DATA_PROCESSING32_NM("cmp",VAL2) + DATA_PROCESSING32_DNM("rsb",VAL2) + + TEST_RR("pkhbt r0, r",0, HH1,", r",1, HH2,"") + TEST_RR("pkhbt r14,r",12, HH1,", r",10,HH2,", lsl #2") + TEST_RR("pkhtb r0, r",0, HH1,", r",1, HH2,"") + TEST_RR("pkhtb r14,r",12, HH1,", r",10,HH2,", asr #2") + + TEST_UNSUPPORTED(".short 0xea17,0x0f0d @ tst.w r7, sp") + TEST_UNSUPPORTED(".short 0xea17,0x0f0f @ tst.w r7, pc") + TEST_UNSUPPORTED(".short 0xea1d,0x0f07 @ tst.w sp, r7") + TEST_UNSUPPORTED(".short 0xea1f,0x0f07 @ tst.w pc, r7") + TEST_UNSUPPORTED(".short 0xf01d,0x1f08 @ tst sp, #0x00080008") + TEST_UNSUPPORTED(".short 0xf01f,0x1f08 @ tst pc, #0x00080008") + + TEST_UNSUPPORTED(".short 0xea97,0x0f0d @ teq.w r7, sp") + TEST_UNSUPPORTED(".short 0xea97,0x0f0f @ teq.w r7, pc") + TEST_UNSUPPORTED(".short 0xea9d,0x0f07 @ teq.w sp, r7") + TEST_UNSUPPORTED(".short 0xea9f,0x0f07 @ teq.w pc, r7") + TEST_UNSUPPORTED(".short 0xf09d,0x1f08 @ tst sp, #0x00080008") + TEST_UNSUPPORTED(".short 0xf09f,0x1f08 @ tst pc, #0x00080008") + + TEST_UNSUPPORTED(".short 0xeb17,0x0f0d @ cmn.w r7, sp") + TEST_UNSUPPORTED(".short 0xeb17,0x0f0f @ cmn.w r7, pc") + TEST_P("cmn.w sp, r",7,0,"") + TEST_UNSUPPORTED(".short 0xeb1f,0x0f07 @ cmn.w pc, r7") + TEST( "cmn sp, #0x00080008") + TEST_UNSUPPORTED(".short 0xf11f,0x1f08 @ cmn pc, #0x00080008") + + TEST_UNSUPPORTED(".short 0xebb7,0x0f0d @ cmp.w r7, sp") + TEST_UNSUPPORTED(".short 0xebb7,0x0f0f @ cmp.w r7, pc") + TEST_P("cmp.w sp, r",7,0,"") + TEST_UNSUPPORTED(".short 0xebbf,0x0f07 @ cmp.w pc, r7") + TEST( "cmp sp, #0x00080008") + TEST_UNSUPPORTED(".short 0xf1bf,0x1f08 @ cmp pc, #0x00080008") + + TEST_UNSUPPORTED(".short 0xea5f,0x070d @ movs.w r7, sp") + TEST_UNSUPPORTED(".short 0xea5f,0x070f @ movs.w r7, pc") + TEST_UNSUPPORTED(".short 0xea5f,0x0d07 @ movs.w sp, r7") + TEST_UNSUPPORTED(".short 0xea4f,0x0f07 @ mov.w pc, r7") + TEST_UNSUPPORTED(".short 0xf04f,0x1d08 @ mov sp, #0x00080008") + TEST_UNSUPPORTED(".short 0xf04f,0x1f08 @ mov pc, #0x00080008") + + TEST_R("add.w r0, sp, r",1, 4,"") + TEST_R("adds r0, sp, r",1, 4,", asl #3") + TEST_R("add r0, sp, r",1, 4,", asl #4") + TEST_R("add r0, sp, r",1, 16,", ror #1") + TEST_R("add.w sp, sp, r",1, 4,"") + TEST_R("add sp, sp, r",1, 4,", asl #3") + TEST_UNSUPPORTED(".short 0xeb0d,0x1d01 @ add sp, sp, r1, asl #4") + TEST_UNSUPPORTED(".short 0xeb0d,0x0d71 @ add sp, sp, r1, ror #1") + TEST( "add.w r0, sp, #24") + TEST( "add.w sp, sp, #24") + TEST_UNSUPPORTED(".short 0xeb0d,0x0f01 @ add pc, sp, r1") + TEST_UNSUPPORTED(".short 0xeb0d,0x000f @ add r0, sp, pc") + TEST_UNSUPPORTED(".short 0xeb0d,0x000d @ add r0, sp, sp") + TEST_UNSUPPORTED(".short 0xeb0d,0x0d0f @ add sp, sp, pc") + TEST_UNSUPPORTED(".short 0xeb0d,0x0d0d @ add sp, sp, sp") + + TEST_R("sub.w r0, sp, r",1, 4,"") + TEST_R("subs r0, sp, r",1, 4,", asl #3") + TEST_R("sub r0, sp, r",1, 4,", asl #4") + TEST_R("sub r0, sp, r",1, 16,", ror #1") + TEST_R("sub.w sp, sp, r",1, 4,"") + TEST_R("sub sp, sp, r",1, 4,", asl #3") + TEST_UNSUPPORTED(".short 0xebad,0x1d01 @ sub sp, sp, r1, asl #4") + 
TEST_UNSUPPORTED(".short 0xebad,0x0d71 @ sub sp, sp, r1, ror #1") + TEST_UNSUPPORTED(".short 0xebad,0x0f01 @ sub pc, sp, r1") + TEST( "sub.w r0, sp, #24") + TEST( "sub.w sp, sp, #24") + + TEST_UNSUPPORTED(".short 0xea02,0x010f @ and r1, r2, pc") + TEST_UNSUPPORTED(".short 0xea0f,0x0103 @ and r1, pc, r3") + TEST_UNSUPPORTED(".short 0xea02,0x0f03 @ and pc, r2, r3") + TEST_UNSUPPORTED(".short 0xea02,0x010d @ and r1, r2, sp") + TEST_UNSUPPORTED(".short 0xea0d,0x0103 @ and r1, sp, r3") + TEST_UNSUPPORTED(".short 0xea02,0x0d03 @ and sp, r2, r3") + TEST_UNSUPPORTED(".short 0xf00d,0x1108 @ and r1, sp, #0x00080008") + TEST_UNSUPPORTED(".short 0xf00f,0x1108 @ and r1, pc, #0x00080008") + TEST_UNSUPPORTED(".short 0xf002,0x1d08 @ and sp, r8, #0x00080008") + TEST_UNSUPPORTED(".short 0xf002,0x1f08 @ and pc, r8, #0x00080008") + + TEST_UNSUPPORTED(".short 0xeb02,0x010f @ add r1, r2, pc") + TEST_UNSUPPORTED(".short 0xeb0f,0x0103 @ add r1, pc, r3") + TEST_UNSUPPORTED(".short 0xeb02,0x0f03 @ add pc, r2, r3") + TEST_UNSUPPORTED(".short 0xeb02,0x010d @ add r1, r2, sp") + TEST_SUPPORTED( ".short 0xeb0d,0x0103 @ add r1, sp, r3") + TEST_UNSUPPORTED(".short 0xeb02,0x0d03 @ add sp, r2, r3") + TEST_SUPPORTED( ".short 0xf10d,0x1108 @ add r1, sp, #0x00080008") + TEST_UNSUPPORTED(".short 0xf10d,0x1f08 @ add pc, sp, #0x00080008") + TEST_UNSUPPORTED(".short 0xf10f,0x1108 @ add r1, pc, #0x00080008") + TEST_UNSUPPORTED(".short 0xf102,0x1d08 @ add sp, r8, #0x00080008") + TEST_UNSUPPORTED(".short 0xf102,0x1f08 @ add pc, r8, #0x00080008") + + TEST_UNSUPPORTED(".short 0xeaa0,0x0000") + TEST_UNSUPPORTED(".short 0xeaf0,0x0000") + TEST_UNSUPPORTED(".short 0xeb20,0x0000") + TEST_UNSUPPORTED(".short 0xeb80,0x0000") + TEST_UNSUPPORTED(".short 0xebe0,0x0000") + + TEST_UNSUPPORTED(".short 0xf0a0,0x0000") + TEST_UNSUPPORTED(".short 0xf0c0,0x0000") + TEST_UNSUPPORTED(".short 0xf0f0,0x0000") + TEST_UNSUPPORTED(".short 0xf120,0x0000") + TEST_UNSUPPORTED(".short 0xf180,0x0000") + TEST_UNSUPPORTED(".short 0xf1e0,0x0000") + + TEST_GROUP("Coprocessor instructions") + + TEST_UNSUPPORTED(".short 0xec00,0x0000") + TEST_UNSUPPORTED(".short 0xeff0,0x0000") + TEST_UNSUPPORTED(".short 0xfc00,0x0000") + TEST_UNSUPPORTED(".short 0xfff0,0x0000") + + TEST_GROUP("Data-processing (plain binary immediate)") + + TEST_R("addw r0, r",1, VAL1,", #0x123") + TEST( "addw r14, sp, #0xf5a") + TEST( "addw sp, sp, #0x20") + TEST( "addw r7, pc, #0x888") + TEST_UNSUPPORTED(".short 0xf20f,0x1f20 @ addw pc, pc, #0x120") + TEST_UNSUPPORTED(".short 0xf20d,0x1f20 @ addw pc, sp, #0x120") + TEST_UNSUPPORTED(".short 0xf20f,0x1d20 @ addw sp, pc, #0x120") + TEST_UNSUPPORTED(".short 0xf200,0x1d20 @ addw sp, r0, #0x120") + + TEST_R("subw r0, r",1, VAL1,", #0x123") + TEST( "subw r14, sp, #0xf5a") + TEST( "subw sp, sp, #0x20") + TEST( "subw r7, pc, #0x888") + TEST_UNSUPPORTED(".short 0xf2af,0x1f20 @ subw pc, pc, #0x120") + TEST_UNSUPPORTED(".short 0xf2ad,0x1f20 @ subw pc, sp, #0x120") + TEST_UNSUPPORTED(".short 0xf2af,0x1d20 @ subw sp, pc, #0x120") + TEST_UNSUPPORTED(".short 0xf2a0,0x1d20 @ subw sp, r0, #0x120") + + TEST("movw r0, #0") + TEST("movw r0, #0xffff") + TEST("movw lr, #0xffff") + TEST_UNSUPPORTED(".short 0xf240,0x0d00 @ movw sp, #0") + TEST_UNSUPPORTED(".short 0xf240,0x0f00 @ movw pc, #0") + + TEST_R("movt r",0, VAL1,", #0") + TEST_R("movt r",0, VAL2,", #0xffff") + TEST_R("movt r",14,VAL1,", #0xffff") + TEST_UNSUPPORTED(".short 0xf2c0,0x0d00 @ movt sp, #0") + TEST_UNSUPPORTED(".short 0xf2c0,0x0f00 @ movt pc, #0") + + TEST_R( "ssat r0, #24, r",0, VAL1,"") + TEST_R( "ssat 
r14, #24, r",12, VAL2,"") + TEST_R( "ssat r0, #24, r",0, VAL1,", lsl #8") + TEST_R( "ssat r14, #24, r",12, VAL2,", asr #8") + TEST_UNSUPPORTED(".short 0xf30c,0x0d17 @ ssat sp, #24, r12") + TEST_UNSUPPORTED(".short 0xf30c,0x0f17 @ ssat pc, #24, r12") + TEST_UNSUPPORTED(".short 0xf30d,0x0c17 @ ssat r12, #24, sp") + TEST_UNSUPPORTED(".short 0xf30f,0x0c17 @ ssat r12, #24, pc") + + TEST_R( "usat r0, #24, r",0, VAL1,"") + TEST_R( "usat r14, #24, r",12, VAL2,"") + TEST_R( "usat r0, #24, r",0, VAL1,", lsl #8") + TEST_R( "usat r14, #24, r",12, VAL2,", asr #8") + TEST_UNSUPPORTED(".short 0xf38c,0x0d17 @ usat sp, #24, r12") + TEST_UNSUPPORTED(".short 0xf38c,0x0f17 @ usat pc, #24, r12") + TEST_UNSUPPORTED(".short 0xf38d,0x0c17 @ usat r12, #24, sp") + TEST_UNSUPPORTED(".short 0xf38f,0x0c17 @ usat r12, #24, pc") + + TEST_R( "ssat16 r0, #12, r",0, HH1,"") + TEST_R( "ssat16 r14, #12, r",12, HH2,"") + TEST_UNSUPPORTED(".short 0xf32c,0x0d0b @ ssat16 sp, #12, r12") + TEST_UNSUPPORTED(".short 0xf32c,0x0f0b @ ssat16 pc, #12, r12") + TEST_UNSUPPORTED(".short 0xf32d,0x0c0b @ ssat16 r12, #12, sp") + TEST_UNSUPPORTED(".short 0xf32f,0x0c0b @ ssat16 r12, #12, pc") + + TEST_R( "usat16 r0, #12, r",0, HH1,"") + TEST_R( "usat16 r14, #12, r",12, HH2,"") + TEST_UNSUPPORTED(".short 0xf3ac,0x0d0b @ usat16 sp, #12, r12") + TEST_UNSUPPORTED(".short 0xf3ac,0x0f0b @ usat16 pc, #12, r12") + TEST_UNSUPPORTED(".short 0xf3ad,0x0c0b @ usat16 r12, #12, sp") + TEST_UNSUPPORTED(".short 0xf3af,0x0c0b @ usat16 r12, #12, pc") + + TEST_R( "sbfx r0, r",0 , VAL1,", #0, #31") + TEST_R( "sbfx r14, r",12, VAL2,", #8, #16") + TEST_R( "sbfx r4, r",10, VAL1,", #16, #15") + TEST_UNSUPPORTED(".short 0xf34c,0x2d0f @ sbfx sp, r12, #8, #16") + TEST_UNSUPPORTED(".short 0xf34c,0x2f0f @ sbfx pc, r12, #8, #16") + TEST_UNSUPPORTED(".short 0xf34d,0x2c0f @ sbfx r12, sp, #8, #16") + TEST_UNSUPPORTED(".short 0xf34f,0x2c0f @ sbfx r12, pc, #8, #16") + + TEST_R( "ubfx r0, r",0 , VAL1,", #0, #31") + TEST_R( "ubfx r14, r",12, VAL2,", #8, #16") + TEST_R( "ubfx r4, r",10, VAL1,", #16, #15") + TEST_UNSUPPORTED(".short 0xf3cc,0x2d0f @ ubfx sp, r12, #8, #16") + TEST_UNSUPPORTED(".short 0xf3cc,0x2f0f @ ubfx pc, r12, #8, #16") + TEST_UNSUPPORTED(".short 0xf3cd,0x2c0f @ ubfx r12, sp, #8, #16") + TEST_UNSUPPORTED(".short 0xf3cf,0x2c0f @ ubfx r12, pc, #8, #16") + + TEST_R( "bfc r",0, VAL1,", #4, #20") + TEST_R( "bfc r",14,VAL2,", #4, #20") + TEST_R( "bfc r",7, VAL1,", #0, #31") + TEST_R( "bfc r",8, VAL2,", #0, #31") + TEST_UNSUPPORTED(".short 0xf36f,0x0d1e @ bfc sp, #0, #31") + TEST_UNSUPPORTED(".short 0xf36f,0x0f1e @ bfc pc, #0, #31") + + TEST_RR( "bfi r",0, VAL1,", r",0 , VAL2,", #0, #31") + TEST_RR( "bfi r",12,VAL1,", r",14 , VAL2,", #4, #20") + TEST_UNSUPPORTED(".short 0xf36e,0x1d17 @ bfi sp, r14, #4, #20") + TEST_UNSUPPORTED(".short 0xf36e,0x1f17 @ bfi pc, r14, #4, #20") + TEST_UNSUPPORTED(".short 0xf36d,0x1e17 @ bfi r14, sp, #4, #20") + + TEST_GROUP("Branches and miscellaneous control") + +CONDITION_INSTRUCTIONS(22, + TEST_BF("beq.w 2f") + TEST_BB("bne.w 2b") + TEST_BF("bgt.w 2f") + TEST_BB("blt.w 2b") + TEST_BF_X("bpl.w 2f",0x1000) +) + + TEST_UNSUPPORTED("msr cpsr, r0") + TEST_UNSUPPORTED("msr cpsr_f, r1") + TEST_UNSUPPORTED("msr spsr, r2") + + TEST_UNSUPPORTED("cpsie.w i") + TEST_UNSUPPORTED("cpsid.w i") + TEST_UNSUPPORTED("cps 0x13") + + TEST_SUPPORTED("yield.w") + TEST("sev.w") + TEST("nop.w") + TEST("wfi.w") + TEST_SUPPORTED("wfe.w") + TEST_UNSUPPORTED("dbg.w #0") + + TEST_UNSUPPORTED("clrex") + TEST_UNSUPPORTED("dsb") + TEST_UNSUPPORTED("dmb") + 
TEST_UNSUPPORTED("isb") + + TEST_UNSUPPORTED("bxj r0") + + TEST_UNSUPPORTED("subs pc, lr, #4") + + TEST("mrs r0, cpsr") + TEST("mrs r14, cpsr") + TEST_UNSUPPORTED(".short 0xf3ef,0x8d00 @ mrs sp, spsr") + TEST_UNSUPPORTED(".short 0xf3ef,0x8f00 @ mrs pc, spsr") + TEST_UNSUPPORTED("mrs r0, spsr") + TEST_UNSUPPORTED("mrs lr, spsr") + + TEST_UNSUPPORTED(".short 0xf7f0,0x8000 @ smc #0") + + TEST_UNSUPPORTED(".short 0xf7f0,0xa000 @ undefeined") + + TEST_BF( "b.w 2f") + TEST_BB( "b.w 2b") + TEST_BF_X("b.w 2f", 0x1000) + + TEST_BF( "bl.w 2f") + TEST_BB( "bl.w 2b") + TEST_BB_X("bl.w 2b", 0x1000) + + TEST_X( "blx __dummy_arm_subroutine", + ".arm \n\t" + ".align \n\t" + ".type __dummy_arm_subroutine, %%function \n\t" + "__dummy_arm_subroutine: \n\t" + "mov r0, pc \n\t" + "bx lr \n\t" + ".thumb \n\t" + ) + TEST( "blx __dummy_arm_subroutine") + + TEST_GROUP("Store single data item") + +#define SINGLE_STORE(size) \ + TEST_RP( "str"size" r",0, VAL1,", [r",11,-1024,", #1024]") \ + TEST_RP( "str"size" r",14,VAL2,", [r",1, -1024,", #1080]") \ + TEST_RP( "str"size" r",0, VAL1,", [r",11,256, ", #-120]") \ + TEST_RP( "str"size" r",14,VAL2,", [r",1, 256, ", #-128]") \ + TEST_RP( "str"size" r",0, VAL1,", [r",11,24, "], #120") \ + TEST_RP( "str"size" r",14,VAL2,", [r",1, 24, "], #128") \ + TEST_RP( "str"size" r",0, VAL1,", [r",11,24, "], #-120") \ + TEST_RP( "str"size" r",14,VAL2,", [r",1, 24, "], #-128") \ + TEST_RP( "str"size" r",0, VAL1,", [r",11,24, ", #120]!") \ + TEST_RP( "str"size" r",14,VAL2,", [r",1, 24, ", #128]!") \ + TEST_RP( "str"size" r",0, VAL1,", [r",11,256, ", #-120]!") \ + TEST_RP( "str"size" r",14,VAL2,", [r",1, 256, ", #-128]!") \ + TEST_RPR("str"size".w r",0, VAL1,", [r",1, 0,", r",2, 4,"]") \ + TEST_RPR("str"size" r",14,VAL2,", [r",10,0,", r",11,4,", lsl #1]") \ + TEST_R( "str"size".w r",7, VAL1,", [sp, #24]") \ + TEST_RP( "str"size".w r",0, VAL2,", [r",0,0, "]") \ + TEST_UNSUPPORTED("str"size"t r0, [r1, #4]") + + SINGLE_STORE("b") + SINGLE_STORE("h") + SINGLE_STORE("") + + TEST("str sp, [sp]") + TEST_UNSUPPORTED(".short 0xf8cf,0xe000 @ str r14, [pc]") + TEST_UNSUPPORTED(".short 0xf8ce,0xf000 @ str pc, [r14]") + + TEST_GROUP("Advanced SIMD element or structure load/store instructions") + + TEST_UNSUPPORTED(".short 0xf900,0x0000") + TEST_UNSUPPORTED(".short 0xf92f,0xffff") + TEST_UNSUPPORTED(".short 0xf980,0x0000") + TEST_UNSUPPORTED(".short 0xf9ef,0xffff") + + TEST_GROUP("Load single data item and memory hints") + +#define SINGLE_LOAD(size) \ + TEST_P( "ldr"size" r0, [r",11,-1024, ", #1024]") \ + TEST_P( "ldr"size" r14, [r",1, -1024,", #1080]") \ + TEST_P( "ldr"size" r0, [r",11,256, ", #-120]") \ + TEST_P( "ldr"size" r14, [r",1, 256, ", #-128]") \ + TEST_P( "ldr"size" r0, [r",11,24, "], #120") \ + TEST_P( "ldr"size" r14, [r",1, 24, "], #128") \ + TEST_P( "ldr"size" r0, [r",11,24, "], #-120") \ + TEST_P( "ldr"size" r14, [r",1,24, "], #-128") \ + TEST_P( "ldr"size" r0, [r",11,24, ", #120]!") \ + TEST_P( "ldr"size" r14, [r",1, 24, ", #128]!") \ + TEST_P( "ldr"size" r0, [r",11,256, ", #-120]!") \ + TEST_P( "ldr"size" r14, [r",1, 256, ", #-128]!") \ + TEST_PR("ldr"size".w r0, [r",1, 0,", r",2, 4,"]") \ + TEST_PR("ldr"size" r14, [r",10,0,", r",11,4,", lsl #1]") \ + TEST_X( "ldr"size".w r0, 3f", \ + ".align 3 \n\t" \ + "3: .word "__stringify(VAL1)) \ + TEST_X( "ldr"size".w r14, 3f", \ + ".align 3 \n\t" \ + "3: .word "__stringify(VAL2)) \ + TEST( "ldr"size".w r7, 3b") \ + TEST( "ldr"size".w r7, [sp, #24]") \ + TEST_P( "ldr"size".w r0, [r",0,0, "]") \ + TEST_UNSUPPORTED("ldr"size"t r0, [r1, #4]") + + 
SINGLE_LOAD("b") + SINGLE_LOAD("sb") + SINGLE_LOAD("h") + SINGLE_LOAD("sh") + SINGLE_LOAD("") + + TEST_BF_P("ldr pc, [r",14, 15*4,"]") + TEST_P( "ldr sp, [r",14, 13*4,"]") + TEST_BF_R("ldr pc, [sp, r",14, 15*4,"]") + TEST_R( "ldr sp, [sp, r",14, 13*4,"]") + TEST_THUMB_TO_ARM_INTERWORK_P("ldr pc, [r",0,0,", #15*4]") + TEST_SUPPORTED("ldr sp, 99f") + TEST_SUPPORTED("ldr pc, 99f") + + TEST_UNSUPPORTED(".short 0xf854,0x700d @ ldr r7, [r4, sp]") + TEST_UNSUPPORTED(".short 0xf854,0x700f @ ldr r7, [r4, pc]") + TEST_UNSUPPORTED(".short 0xf814,0x700d @ ldrb r7, [r4, sp]") + TEST_UNSUPPORTED(".short 0xf814,0x700f @ ldrb r7, [r4, pc]") + TEST_UNSUPPORTED(".short 0xf89f,0xd004 @ ldrb sp, 99f") + TEST_UNSUPPORTED(".short 0xf814,0xd008 @ ldrb sp, [r4, r8]") + TEST_UNSUPPORTED(".short 0xf894,0xd000 @ ldrb sp, [r4]") + + TEST_UNSUPPORTED(".short 0xf860,0x0000") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xf9ff,0xffff") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xf950,0x0000") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xf95f,0xffff") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xf800,0x0800") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xf97f,0xfaff") /* Unallocated space */ + + TEST( "pli [pc, #4]") + TEST( "pli [pc, #-4]") + TEST( "pld [pc, #4]") + TEST( "pld [pc, #-4]") + + TEST_P( "pld [r",0,-1024,", #1024]") + TEST( ".short 0xf8b0,0xf400 @ pldw [r0, #1024]") + TEST_P( "pli [r",4, 0b,", #1024]") + TEST_P( "pld [r",7, 120,", #-120]") + TEST( ".short 0xf837,0xfc78 @ pldw [r7, #-120]") + TEST_P( "pli [r",11,120,", #-120]") + TEST( "pld [sp, #0]") + + TEST_PR("pld [r",7, 24, ", r",0, 16,"]") + TEST_PR("pld [r",8, 24, ", r",12,16,", lsl #3]") + TEST_SUPPORTED(".short 0xf837,0xf000 @ pldw [r7, r0]") + TEST_SUPPORTED(".short 0xf838,0xf03c @ pldw [r8, r12, lsl #3]"); + TEST_RR("pli [r",12,0b,", r",0, 16,"]") + TEST_RR("pli [r",0, 0b,", r",12,16,", lsl #3]") + TEST_R( "pld [sp, r",1, 16,"]") + TEST_UNSUPPORTED(".short 0xf817,0xf00d @pld [r7, sp]") + TEST_UNSUPPORTED(".short 0xf817,0xf00f @pld [r7, pc]") + + TEST_GROUP("Data-processing (register)") + +#define SHIFTS32(op) \ + TEST_RR(op" r0, r",1, VAL1,", r",2, 3, "") \ + TEST_RR(op" r14, r",12,VAL2,", r",11,10,"") + + SHIFTS32("lsl") + SHIFTS32("lsls") + SHIFTS32("lsr") + SHIFTS32("lsrs") + SHIFTS32("asr") + SHIFTS32("asrs") + SHIFTS32("ror") + SHIFTS32("rors") + + TEST_UNSUPPORTED(".short 0xfa01,0xff02 @ lsl pc, r1, r2") + TEST_UNSUPPORTED(".short 0xfa01,0xfd02 @ lsl sp, r1, r2") + TEST_UNSUPPORTED(".short 0xfa0f,0xf002 @ lsl r0, pc, r2") + TEST_UNSUPPORTED(".short 0xfa0d,0xf002 @ lsl r0, sp, r2") + TEST_UNSUPPORTED(".short 0xfa01,0xf00f @ lsl r0, r1, pc") + TEST_UNSUPPORTED(".short 0xfa01,0xf00d @ lsl r0, r1, sp") + + TEST_RR( "sxtah r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "sxtah r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "sxth r8, r",7, HH1,"") + + TEST_UNSUPPORTED(".short 0xfa0f,0xff87 @ sxth pc, r7"); + TEST_UNSUPPORTED(".short 0xfa0f,0xfd87 @ sxth sp, r7"); + TEST_UNSUPPORTED(".short 0xfa0f,0xf88f @ sxth r8, pc"); + TEST_UNSUPPORTED(".short 0xfa0f,0xf88d @ sxth r8, sp"); + + TEST_RR( "uxtah r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uxtah r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "uxth r8, r",7, HH1,"") + + TEST_RR( "sxtab16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "sxtab16 r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "sxtb16 r8, r",7, HH1,"") + + TEST_RR( "uxtab16 r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uxtab16 r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "uxtb16 r8, r",7, HH1,"") + + 
TEST_RR( "sxtab r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "sxtab r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "sxtb r8, r",7, HH1,"") + + TEST_RR( "uxtab r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "uxtab r14,r",12, HH2,", r",10,HH1,", ror #8") + TEST_R( "uxtb r8, r",7, HH1,"") + + TEST_UNSUPPORTED(".short 0xfa60,0x00f0") + TEST_UNSUPPORTED(".short 0xfa7f,0xffff") + +#define PARALLEL_ADD_SUB(op) \ + TEST_RR( op"add16 r0, r",0, HH1,", r",1, HH2,"") \ + TEST_RR( op"add16 r14, r",12,HH2,", r",10,HH1,"") \ + TEST_RR( op"asx r0, r",0, HH1,", r",1, HH2,"") \ + TEST_RR( op"asx r14, r",12,HH2,", r",10,HH1,"") \ + TEST_RR( op"sax r0, r",0, HH1,", r",1, HH2,"") \ + TEST_RR( op"sax r14, r",12,HH2,", r",10,HH1,"") \ + TEST_RR( op"sub16 r0, r",0, HH1,", r",1, HH2,"") \ + TEST_RR( op"sub16 r14, r",12,HH2,", r",10,HH1,"") \ + TEST_RR( op"add8 r0, r",0, HH1,", r",1, HH2,"") \ + TEST_RR( op"add8 r14, r",12,HH2,", r",10,HH1,"") \ + TEST_RR( op"sub8 r0, r",0, HH1,", r",1, HH2,"") \ + TEST_RR( op"sub8 r14, r",12,HH2,", r",10,HH1,"") + + TEST_GROUP("Parallel addition and subtraction, signed") + + PARALLEL_ADD_SUB("s") + PARALLEL_ADD_SUB("q") + PARALLEL_ADD_SUB("sh") + + TEST_GROUP("Parallel addition and subtraction, unsigned") + + PARALLEL_ADD_SUB("u") + PARALLEL_ADD_SUB("uq") + PARALLEL_ADD_SUB("uh") + + TEST_GROUP("Miscellaneous operations") + + TEST_RR("qadd r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR("qadd lr, r",9, VAL2,", r",8, VAL1,"") + TEST_RR("qsub r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR("qsub lr, r",9, VAL2,", r",8, VAL1,"") + TEST_RR("qdadd r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR("qdadd lr, r",9, VAL2,", r",8, VAL1,"") + TEST_RR("qdsub r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR("qdsub lr, r",9, VAL2,", r",8, VAL1,"") + + TEST_R("rev.w r0, r",0, VAL1,"") + TEST_R("rev r14, r",12, VAL2,"") + TEST_R("rev16.w r0, r",0, VAL1,"") + TEST_R("rev16 r14, r",12, VAL2,"") + TEST_R("rbit r0, r",0, VAL1,"") + TEST_R("rbit r14, r",12, VAL2,"") + TEST_R("revsh.w r0, r",0, VAL1,"") + TEST_R("revsh r14, r",12, VAL2,"") + + TEST_UNSUPPORTED(".short 0xfa9c,0xff8c @ rev pc, r12"); + TEST_UNSUPPORTED(".short 0xfa9c,0xfd8c @ rev sp, r12"); + TEST_UNSUPPORTED(".short 0xfa9f,0xfe8f @ rev r14, pc"); + TEST_UNSUPPORTED(".short 0xfa9d,0xfe8d @ rev r14, sp"); + + TEST_RR("sel r0, r",0, VAL1,", r",1, VAL2,"") + TEST_RR("sel r14, r",12,VAL1,", r",10, VAL2,"") + + TEST_R("clz r0, r",0, 0x0,"") + TEST_R("clz r7, r",14,0x1,"") + TEST_R("clz lr, r",7, 0xffffffff,"") + + TEST_UNSUPPORTED(".short 0xfa80,0xf030") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xfaff,0xff7f") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xfab0,0xf000") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xfaff,0xff7f") /* Unallocated space */ + + TEST_GROUP("Multiply, multiply accumulate, and absolute difference operations") + + TEST_RR( "mul r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "mul r7, r",8, VAL2,", r",9, VAL2,"") + TEST_UNSUPPORTED(".short 0xfb08,0xff09 @ mul pc, r8, r9") + TEST_UNSUPPORTED(".short 0xfb08,0xfd09 @ mul sp, r8, r9") + TEST_UNSUPPORTED(".short 0xfb0f,0xf709 @ mul r7, pc, r9") + TEST_UNSUPPORTED(".short 0xfb0d,0xf709 @ mul r7, sp, r9") + TEST_UNSUPPORTED(".short 0xfb08,0xf70f @ mul r7, r8, pc") + TEST_UNSUPPORTED(".short 0xfb08,0xf70d @ mul r7, r8, sp") + + TEST_RRR( "mla r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "mla r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_UNSUPPORTED(".short 0xfb08,0xaf09 @ mla pc, r8, r9, r10"); + TEST_UNSUPPORTED(".short 0xfb08,0xad09 @ mla sp, r8, r9, r10"); + 
TEST_UNSUPPORTED(".short 0xfb0f,0xa709 @ mla r7, pc, r9, r10"); + TEST_UNSUPPORTED(".short 0xfb0d,0xa709 @ mla r7, sp, r9, r10"); + TEST_UNSUPPORTED(".short 0xfb08,0xa70f @ mla r7, r8, pc, r10"); + TEST_UNSUPPORTED(".short 0xfb08,0xa70d @ mla r7, r8, sp, r10"); + TEST_UNSUPPORTED(".short 0xfb08,0xd709 @ mla r7, r8, r9, sp"); + + TEST_RRR( "mls r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "mls r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + + TEST_RRR( "smlabb r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlabb r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RRR( "smlatb r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlatb r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RRR( "smlabt r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlabt r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RRR( "smlatt r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlatt r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RR( "smulbb r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smulbb r7, r",8, VAL3,", r",9, VAL1,"") + TEST_RR( "smultb r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smultb r7, r",8, VAL3,", r",9, VAL1,"") + TEST_RR( "smulbt r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smulbt r7, r",8, VAL3,", r",9, VAL1,"") + TEST_RR( "smultt r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smultt r7, r",8, VAL3,", r",9, VAL1,"") + + TEST_RRR( "smlad r0, r",0, HH1,", r",1, HH2,", r",2, VAL1,"") + TEST_RRR( "smlad r14, r",12,HH2,", r",10,HH1,", r",8, VAL2,"") + TEST_RRR( "smladx r0, r",0, HH1,", r",1, HH2,", r",2, VAL1,"") + TEST_RRR( "smladx r14, r",12,HH2,", r",10,HH1,", r",8, VAL2,"") + TEST_RR( "smuad r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "smuad r14, r",12,HH2,", r",10,HH1,"") + TEST_RR( "smuadx r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "smuadx r14, r",12,HH2,", r",10,HH1,"") + + TEST_RRR( "smlawb r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlawb r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RRR( "smlawt r0, r",1, VAL1,", r",2, VAL2,", r",3, VAL3,"") + TEST_RRR( "smlawt r7, r",8, VAL3,", r",9, VAL1,", r",10, VAL2,"") + TEST_RR( "smulwb r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smulwb r7, r",8, VAL3,", r",9, VAL1,"") + TEST_RR( "smulwt r0, r",1, VAL1,", r",2, VAL2,"") + TEST_RR( "smulwt r7, r",8, VAL3,", r",9, VAL1,"") + + TEST_RRR( "smlsd r0, r",0, HH1,", r",1, HH2,", r",2, VAL1,"") + TEST_RRR( "smlsd r14, r",12,HH2,", r",10,HH1,", r",8, VAL2,"") + TEST_RRR( "smlsdx r0, r",0, HH1,", r",1, HH2,", r",2, VAL1,"") + TEST_RRR( "smlsdx r14, r",12,HH2,", r",10,HH1,", r",8, VAL2,"") + TEST_RR( "smusd r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "smusd r14, r",12,HH2,", r",10,HH1,"") + TEST_RR( "smusdx r0, r",0, HH1,", r",1, HH2,"") + TEST_RR( "smusdx r14, r",12,HH2,", r",10,HH1,"") + + TEST_RRR( "smmla r0, r",0, VAL1,", r",1, VAL2,", r",2, VAL1,"") + TEST_RRR( "smmla r14, r",12,VAL2,", r",10,VAL1,", r",8, VAL2,"") + TEST_RRR( "smmlar r0, r",0, VAL1,", r",1, VAL2,", r",2, VAL1,"") + TEST_RRR( "smmlar r14, r",12,VAL2,", r",10,VAL1,", r",8, VAL2,"") + TEST_RR( "smmul r0, r",0, VAL1,", r",1, VAL2,"") + TEST_RR( "smmul r14, r",12,VAL2,", r",10,VAL1,"") + TEST_RR( "smmulr r0, r",0, VAL1,", r",1, VAL2,"") + TEST_RR( "smmulr r14, r",12,VAL2,", r",10,VAL1,"") + + TEST_RRR( "smmls r0, r",0, VAL1,", r",1, VAL2,", r",2, VAL1,"") + TEST_RRR( "smmls r14, r",12,VAL2,", r",10,VAL1,", r",8, VAL2,"") + TEST_RRR( "smmlsr r0, r",0, VAL1,", r",1, VAL2,", r",2, VAL1,"") + TEST_RRR( "smmlsr r14, r",12,VAL2,", r",10,VAL1,", 
r",8, VAL2,"") + + TEST_RRR( "usada8 r0, r",0, VAL1,", r",1, VAL2,", r",2, VAL3,"") + TEST_RRR( "usada8 r14, r",12,VAL2,", r",10,VAL1,", r",8, VAL3,"") + TEST_RR( "usad8 r0, r",0, VAL1,", r",1, VAL2,"") + TEST_RR( "usad8 r14, r",12,VAL2,", r",10,VAL1,"") + + TEST_UNSUPPORTED(".short 0xfb00,0xf010") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xfb0f,0xff1f") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xfb70,0xf010") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xfb7f,0xff1f") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xfb70,0x0010") /* Unallocated space */ + TEST_UNSUPPORTED(".short 0xfb7f,0xff1f") /* Unallocated space */ + + TEST_GROUP("Long multiply, long multiply accumulate, and divide") + + TEST_RR( "smull r0, r1, r",2, VAL1,", r",3, VAL2,"") + TEST_RR( "smull r7, r8, r",9, VAL2,", r",10, VAL1,"") + TEST_UNSUPPORTED(".short 0xfb89,0xf80a @ smull pc, r8, r9, r10"); + TEST_UNSUPPORTED(".short 0xfb89,0xd80a @ smull sp, r8, r9, r10"); + TEST_UNSUPPORTED(".short 0xfb89,0x7f0a @ smull r7, pc, r9, r10"); + TEST_UNSUPPORTED(".short 0xfb89,0x7d0a @ smull r7, sp, r9, r10"); + TEST_UNSUPPORTED(".short 0xfb8f,0x780a @ smull r7, r8, pc, r10"); + TEST_UNSUPPORTED(".short 0xfb8d,0x780a @ smull r7, r8, sp, r10"); + TEST_UNSUPPORTED(".short 0xfb89,0x780f @ smull r7, r8, r9, pc"); + TEST_UNSUPPORTED(".short 0xfb89,0x780d @ smull r7, r8, r9, sp"); + + TEST_RR( "umull r0, r1, r",2, VAL1,", r",3, VAL2,"") + TEST_RR( "umull r7, r8, r",9, VAL2,", r",10, VAL1,"") + + TEST_RRRR( "smlal r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "smlal r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + + TEST_RRRR( "smlalbb r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "smlalbb r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRRR( "smlalbt r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "smlalbt r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRRR( "smlaltb r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "smlaltb r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRRR( "smlaltt r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "smlaltt r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + + TEST_RRRR( "smlald r",0, VAL1,", r",1, VAL2, ", r",0, HH1,", r",1, HH2) + TEST_RRRR( "smlald r",11,VAL2,", r",10,VAL1, ", r",9, HH2,", r",8, HH1) + TEST_RRRR( "smlaldx r",0, VAL1,", r",1, VAL2, ", r",0, HH1,", r",1, HH2) + TEST_RRRR( "smlaldx r",11,VAL2,", r",10,VAL1, ", r",9, HH2,", r",8, HH1) + + TEST_RRRR( "smlsld r",0, VAL1,", r",1, VAL2, ", r",0, HH1,", r",1, HH2) + TEST_RRRR( "smlsld r",11,VAL2,", r",10,VAL1, ", r",9, HH2,", r",8, HH1) + TEST_RRRR( "smlsldx r",0, VAL1,", r",1, VAL2, ", r",0, HH1,", r",1, HH2) + TEST_RRRR( "smlsldx r",11,VAL2,", r",10,VAL1, ", r",9, HH2,", r",8, HH1) + + TEST_RRRR( "umlal r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "umlal r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + TEST_RRRR( "umaal r",0, VAL1,", r",1, VAL2,", r",2, VAL3,", r",3, VAL4) + TEST_RRRR( "umaal r",8, VAL4,", r",9, VAL1,", r",10,VAL2,", r",11,VAL3) + + TEST_GROUP("Coprocessor instructions") + + TEST_UNSUPPORTED(".short 0xfc00,0x0000") + TEST_UNSUPPORTED(".short 0xffff,0xffff") + + TEST_GROUP("Testing instructions in IT blocks") + + TEST_ITBLOCK("sub.w r0, r0") + + verbose("\n"); +} + diff --git a/arch/arm/kernel/kprobes-test.c b/arch/arm/kernel/kprobes-test.c new file mode 100644 index 000000000000..e17cdd6d90d8 --- /dev/null +++ 
b/arch/arm/kernel/kprobes-test.c @@ -0,0 +1,1748 @@ +/* + * arch/arm/kernel/kprobes-test.c + * + * Copyright (C) 2011 Jon Medhurst . + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +/* + * This file contains test code for ARM kprobes. + * + * The top level function run_all_tests() executes tests for all of the + * supported instruction sets: ARM, 16-bit Thumb, and 32-bit Thumb. These tests + * fall into two categories; run_api_tests() checks basic functionality of the + * kprobes API, and run_test_cases() is a comprehensive test for kprobes + * instruction decoding and simulation. + * + * run_test_cases() first checks the kprobes decoding table for self consistency + * (using table_test()) then executes a series of test cases for each of the CPU + * instruction forms. coverage_start() and coverage_end() are used to verify + * that these test cases cover all of the possible combinations of instructions + * described by the kprobes decoding tables. + * + * The individual test cases are in kprobes-test-arm.c and kprobes-test-thumb.c + * which use the macros defined in kprobes-test.h. The rest of this + * documentation will describe the operation of the framework used by these + * test cases. + */ + +/* + * TESTING METHODOLOGY + * ------------------- + * + * The methodology used to test an ARM instruction 'test_insn' is to use + * inline assembler like: + * + * test_before: nop + * test_case: test_insn + * test_after: nop + * + * When the test case is run, a kprobe is placed on each nop. The + * post-handler of the test_before probe is used to modify the saved CPU + * register context to that which we require for the test case. The + * pre-handler of the test_after probe saves a copy of the CPU + * register context. In this way we can execute test_insn with a specific + * register context and see the results afterwards. + * + * To actually test the kprobes instruction emulation we perform the above + * step a second time but with an additional kprobe on the test_case + * instruction itself. If the emulation is accurate then the results seen + * by the test_after probe will be identical to the first run which didn't + * have a probe on test_case. + * + * Each test case is run several times with a variety of variations in the + * flags value stored in CPSR, and for Thumb code, different ITState. + * + * For instructions which can modify PC, a second test_after probe is used + * like this: + * + * test_before: nop + * test_case: test_insn + * test_after: nop + * b test_done + * test_after2: nop + * test_done: + * + * The test case is constructed such that test_insn branches to + * test_after2, or, if testing a conditional instruction, it may just + * continue to test_after. The probes inserted at both locations let us + * determine which happened. A similar approach is used for testing + * backwards branches... + * + * b test_before + * b test_done @ helps to cope with off by 1 branches + * test_after2: nop + * b test_done + * test_before: nop + * test_case: test_insn + * test_after: nop + * test_done: + * + * The macros used to generate the assembler instructions described above + * are TEST_INSTRUCTION, TEST_BRANCH_F (branch forwards) and TEST_BRANCH_B + * (branch backwards). In these, the local labels numbered 1, 50, 2 and + * 99 represent: test_before, test_case, test_after2 and test_done. 
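+ *
+ * In outline, the heart of this methodology is a context comparison. As a
+ * rough sketch (the helper names below are invented for illustration; the
+ * real driver code is described in the FRAMEWORK section):
+ *
+ *	struct pt_regs expected, actual;
+ *	run_test_case(&expected, false);	// no probe on test_case
+ *	run_test_case(&actual, true);	// probe on test_case, so kprobes
+ *					// must emulate test_insn
+ *	if (memcmp(&expected, &actual, sizeof(expected)) != 0)
+ *		report_failure();	// emulation diverged from real execution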
+ * + * FRAMEWORK + * --------- + * + * Each test case is wrapped between the pair of macros TESTCASE_START and + * TESTCASE_END. As well as performing the inline assembler boilerplate, + * these call out to the kprobes_test_case_start() and + * kprobes_test_case_end() functions which drive the execution of the test + * case. The specific arguments to use for each test case are stored as + * inline data constructed using the various TEST_ARG_* macros. Putting + * this all together, a simple test case may look like: + * + * TESTCASE_START("Testing mov r0, r7") + * TEST_ARG_REG(7, 0x12345678) // Set r7=0x12345678 + * TEST_ARG_END("") + * TEST_INSTRUCTION("mov r0, r7") + * TESTCASE_END + * + * Note, in practice the single convenience macro TEST_R would be used for this + * instead. + * + * The above would expand to assembler looking something like: + * + * @ TESTCASE_START + * bl __kprobes_test_case_start + * @ start of inline data... + * .ascii "mov r0, r7" @ text title for test case + * .byte 0 + * .align 2 + * + * @ TEST_ARG_REG + * .byte ARG_TYPE_REG + * .byte 7 + * .short 0 + * .word 0x12345678 + * + * @ TEST_ARG_END + * .byte ARG_TYPE_END + * .byte TEST_ISA @ flags, including ISA being tested + * .short 50f-0f @ offset of 'test_before' + * .short 2f-0f @ offset of 'test_after2' (if relevant) + * .short 99f-0f @ offset of 'test_done' + * @ start of test case code... + * 0: + * .code TEST_ISA @ switch to ISA being tested + * + * @ TEST_INSTRUCTION + * 50: nop @ location for 'test_before' probe + * 1: mov r0, r7 @ the test case instruction 'test_insn' + * nop @ location for 'test_after' probe + * + * @ TESTCASE_END + * 2: + * 99: bl __kprobes_test_case_end_##TEST_ISA + * .code NORMAL_ISA + * + * When the above is executed, the following happens... + * + * __kprobes_test_case_start() is an assembler wrapper which sets up space + * for a stack buffer and calls the C function kprobes_test_case_start(). + * This C function will do some initial processing of the inline data and + * set up some global state. It then inserts the test_before and test_after + * kprobes and returns a value which causes the assembler wrapper to jump + * to the start of the test case code (local label '0'). + * + * When the test case code executes, the test_before probe will be hit and + * test_before_post_handler will call setup_test_context(). This fills the + * stack buffer and CPU registers with a test pattern and then processes + * the test case arguments. In our example there is one TEST_ARG_REG which + * indicates that R7 should be loaded with the value 0x12345678. + * + * When the test_before probe ends, the test case continues and executes + * the "mov r0, r7" instruction. It then hits the test_after probe and the + * pre-handler for this (test_after_pre_handler) will save a copy of the + * CPU register context. This should now have R0 holding the same value as + * R7. + * + * Finally we get to the call to __kprobes_test_case_end_{32,16}. This is + * an assembler wrapper which switches back to the ISA used by the test + * code and calls the C function kprobes_test_case_end(). + * + * For each run through the test case, test_case_run_count is incremented + * by one. For even runs, kprobes_test_case_end() saves a copy of the + * register and stack buffer contents from the test case just run. It then + * inserts a kprobe on the test case instruction 'test_insn' and returns a + * value to cause the test case code to be re-run. 
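+ *
+ * Schematically, that even-numbered half of the cycle looks like this
+ * (illustrative pseudo-code only; the helper names are invented and the
+ * real logic lives in kprobes_test_case_end()):
+ *
+ *	if (test_case_run_count is even) {
+ *		save_regs_and_stack();		// reference result, no probe
+ *		register_probe_on_test_insn();
+ *		return RERUN_TEST_CASE;		// run again, this time probed
+ *	}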
+ * + * For odd numbered runs, kprobes_test_case_end() compares the register and + * stack buffer contents to those that were saved on the previous even + * numbered run (the one without the kprobe on test_insn). These should be + * the same if the kprobe instruction simulation routine is correct. + * + * The pair of test case runs is repeated with different combinations of + * flag values in CPSR and, for Thumb, different ITState. This is + * controlled by test_context_cpsr(). + * + * BUILDING TEST CASES + * ------------------- + * + * + * As an aid to building test cases, the stack buffer is initialised with + * some special values: + * + * [SP+13*4] Contains SP+120. This can be used to test instructions + * which load a value into SP. + * + * [SP+15*4] When testing branching instructions using TEST_BRANCH_{F,B}, + * this holds the target address of the branch, 'test_after2'. + * This can be used to test instructions which load a PC value + * from memory. + */ + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/slab.h> +#include <linux/kprobes.h> + +#include "kprobes.h" +#include "kprobes-test.h" + + +#define BENCHMARKING 1 + + +/* + * Test basic API + */ + +static bool test_regs_ok; +static int test_func_instance; +static int pre_handler_called; +static int post_handler_called; +static int jprobe_func_called; +static int kretprobe_handler_called; + +#define FUNC_ARG1 0x12345678 +#define FUNC_ARG2 0xabcdef + + +#ifndef CONFIG_THUMB2_KERNEL + +long arm_func(long r0, long r1); + +static void __used __naked __arm_kprobes_test_func(void) +{ + __asm__ __volatile__ ( + ".arm \n\t" + ".type arm_func, %%function \n\t" + "arm_func: \n\t" + "adds r0, r0, r1 \n\t" + "bx lr \n\t" + ".code "NORMAL_ISA /* Back to Thumb if necessary */ + : : : "r0", "r1", "cc" + ); +} + +#else /* CONFIG_THUMB2_KERNEL */ + +long thumb16_func(long r0, long r1); +long thumb32even_func(long r0, long r1); +long thumb32odd_func(long r0, long r1); + +static void __used __naked __thumb_kprobes_test_funcs(void) +{ + __asm__ __volatile__ ( + ".type thumb16_func, %%function \n\t" + "thumb16_func: \n\t" + "adds.n r0, r0, r1 \n\t" + "bx lr \n\t" + + ".align \n\t" + ".type thumb32even_func, %%function \n\t" + "thumb32even_func: \n\t" + "adds.w r0, r0, r1 \n\t" + "bx lr \n\t" + + ".align \n\t" + "nop.n \n\t" + ".type thumb32odd_func, %%function \n\t" + "thumb32odd_func: \n\t" + "adds.w r0, r0, r1 \n\t" + "bx lr \n\t" + + : : : "r0", "r1", "cc" + ); +} + +#endif /* CONFIG_THUMB2_KERNEL */ + + +static int call_test_func(long (*func)(long, long), bool check_test_regs) +{ + long ret; + + ++test_func_instance; + test_regs_ok = false; + + ret = (*func)(FUNC_ARG1, FUNC_ARG2); + if (ret != FUNC_ARG1 + FUNC_ARG2) { + pr_err("FAIL: call_test_func: func returned %lx\n", ret); + return false; + } + + if (check_test_regs && !test_regs_ok) { + pr_err("FAIL: test regs not OK\n"); + return false; + } + + return true; +} + +static int __kprobes pre_handler(struct kprobe *p, struct pt_regs *regs) +{ + pre_handler_called = test_func_instance; + if (regs->ARM_r0 == FUNC_ARG1 && regs->ARM_r1 == FUNC_ARG2) + test_regs_ok = true; + return 0; +} + +static void __kprobes post_handler(struct kprobe *p, struct pt_regs *regs, + unsigned long flags) +{ + post_handler_called = test_func_instance; + if (regs->ARM_r0 != FUNC_ARG1 + FUNC_ARG2 || regs->ARM_r1 != FUNC_ARG2) + test_regs_ok = false; +} + +static struct kprobe the_kprobe = { + .addr = 0, + .pre_handler = pre_handler, + .post_handler = post_handler +}; + +static int test_kprobe(long (*func)(long, long)) +{ + int ret; + + the_kprobe.addr = 
(kprobe_opcode_t *)func; + ret = register_kprobe(&the_kprobe); + if (ret < 0) { + pr_err("FAIL: register_kprobe failed with %d\n", ret); + return ret; + } + + ret = call_test_func(func, true); + + unregister_kprobe(&the_kprobe); + the_kprobe.flags = 0; /* Clear disable flag to allow reuse */ + + if (!ret) + return -EINVAL; + if (pre_handler_called != test_func_instance) { + pr_err("FAIL: kprobe pre_handler not called\n"); + return -EINVAL; + } + if (post_handler_called != test_func_instance) { + pr_err("FAIL: kprobe post_handler not called\n"); + return -EINVAL; + } + if (!call_test_func(func, false)) + return -EINVAL; + if (pre_handler_called == test_func_instance || + post_handler_called == test_func_instance) { + pr_err("FAIL: probe called after unregistering\n"); + return -EINVAL; + } + + return 0; +} + +static void __kprobes jprobe_func(long r0, long r1) +{ + jprobe_func_called = test_func_instance; + if (r0 == FUNC_ARG1 && r1 == FUNC_ARG2) + test_regs_ok = true; + jprobe_return(); +} + +static struct jprobe the_jprobe = { + .entry = jprobe_func, +}; + +static int test_jprobe(long (*func)(long, long)) +{ + int ret; + + the_jprobe.kp.addr = (kprobe_opcode_t *)func; + ret = register_jprobe(&the_jprobe); + if (ret < 0) { + pr_err("FAIL: register_jprobe failed with %d\n", ret); + return ret; + } + + ret = call_test_func(func, true); + + unregister_jprobe(&the_jprobe); + the_jprobe.kp.flags = 0; /* Clear disable flag to allow reuse */ + + if (!ret) + return -EINVAL; + if (jprobe_func_called != test_func_instance) { + pr_err("FAIL: jprobe handler function not called\n"); + return -EINVAL; + } + if (!call_test_func(func, false)) + return -EINVAL; + if (jprobe_func_called == test_func_instance) { + pr_err("FAIL: probe called after unregistering\n"); + return -EINVAL; + } + + return 0; +} + +static int __kprobes +kretprobe_handler(struct kretprobe_instance *ri, struct pt_regs *regs) +{ + kretprobe_handler_called = test_func_instance; + if (regs_return_value(regs) == FUNC_ARG1 + FUNC_ARG2) + test_regs_ok = true; + return 0; +} + +static struct kretprobe the_kretprobe = { + .handler = kretprobe_handler, +}; + +static int test_kretprobe(long (*func)(long, long)) +{ + int ret; + + the_kretprobe.kp.addr = (kprobe_opcode_t *)func; + ret = register_kretprobe(&the_kretprobe); + if (ret < 0) { + pr_err("FAIL: register_kretprobe failed with %d\n", ret); + return ret; + } + + ret = call_test_func(func, true); + + unregister_kretprobe(&the_kretprobe); + the_kretprobe.kp.flags = 0; /* Clear disable flag to allow reuse */ + + if (!ret) + return -EINVAL; + if (kretprobe_handler_called != test_func_instance) { + pr_err("FAIL: kretprobe handler not called\n"); + return -EINVAL; + } + if (!call_test_func(func, false)) + return -EINVAL; + if (kretprobe_handler_called == test_func_instance) { + pr_err("FAIL: kretprobe called after unregistering\n"); + return -EINVAL; + } + + return 0; +} + +static int run_api_tests(long (*func)(long, long)) +{ + int ret; + + pr_info(" kprobe\n"); + ret = test_kprobe(func); + if (ret < 0) + return ret; + + pr_info(" jprobe\n"); + ret = test_jprobe(func); + if (ret < 0) + return ret; + + pr_info(" kretprobe\n"); + ret = test_kretprobe(func); + if (ret < 0) + return ret; + + return 0; +} + + +/* + * Benchmarking + */ + +#if BENCHMARKING + +static void __naked benchmark_nop(void) +{ + __asm__ __volatile__ ( + "nop \n\t" + "bx lr" + ); +} + +#ifdef CONFIG_THUMB2_KERNEL +#define wide ".w" +#else +#define wide +#endif + +static void __naked benchmark_pushpop1(void) +{ + __asm__ __volatile__ ( 
+ "stmdb"wide" sp!, {r3-r11,lr} \n\t" + "ldmia"wide" sp!, {r3-r11,pc}" + ); +} + +static void __naked benchmark_pushpop2(void) +{ + __asm__ __volatile__ ( + "stmdb"wide" sp!, {r0-r8,lr} \n\t" + "ldmia"wide" sp!, {r0-r8,pc}" + ); +} + +static void __naked benchmark_pushpop3(void) +{ + __asm__ __volatile__ ( + "stmdb"wide" sp!, {r4,lr} \n\t" + "ldmia"wide" sp!, {r4,pc}" + ); +} + +static void __naked benchmark_pushpop4(void) +{ + __asm__ __volatile__ ( + "stmdb"wide" sp!, {r0,lr} \n\t" + "ldmia"wide" sp!, {r0,pc}" + ); +} + + +#ifdef CONFIG_THUMB2_KERNEL + +static void __naked benchmark_pushpop_thumb(void) +{ + __asm__ __volatile__ ( + "push.n {r0-r7,lr} \n\t" + "pop.n {r0-r7,pc}" + ); +} + +#endif + +static int __kprobes +benchmark_pre_handler(struct kprobe *p, struct pt_regs *regs) +{ + return 0; +} + +static int benchmark(void(*fn)(void)) +{ + unsigned n, i, t, t0; + + for (n = 1000; ; n *= 2) { + t0 = sched_clock(); + for (i = n; i > 0; --i) + fn(); + t = sched_clock() - t0; + if (t >= 250000000) + break; /* Stop once we took more than 0.25 seconds */ + } + return t / n; /* Time for one iteration in nanoseconds */ +}; + +static int kprobe_benchmark(void(*fn)(void), unsigned offset) +{ + struct kprobe k = { + .addr = (kprobe_opcode_t *)((uintptr_t)fn + offset), + .pre_handler = benchmark_pre_handler, + }; + + int ret = register_kprobe(&k); + if (ret < 0) { + pr_err("FAIL: register_kprobe failed with %d\n", ret); + return ret; + } + + ret = benchmark(fn); + + unregister_kprobe(&k); + return ret; +}; + +struct benchmarks { + void (*fn)(void); + unsigned offset; + const char *title; +}; + +static int run_benchmarks(void) +{ + int ret; + struct benchmarks list[] = { + {&benchmark_nop, 0, "nop"}, + /* + * benchmark_pushpop{1,3} will have the optimised + * instruction emulation, whilst benchmark_pushpop{2,4} will + * be the equivalent unoptimised instructions. 
+ */ + {&benchmark_pushpop1, 0, "stmdb sp!, {r3-r11,lr}"}, + {&benchmark_pushpop1, 4, "ldmia sp!, {r3-r11,pc}"}, + {&benchmark_pushpop2, 0, "stmdb sp!, {r0-r8,lr}"}, + {&benchmark_pushpop2, 4, "ldmia sp!, {r0-r8,pc}"}, + {&benchmark_pushpop3, 0, "stmdb sp!, {r4,lr}"}, + {&benchmark_pushpop3, 4, "ldmia sp!, {r4,pc}"}, + {&benchmark_pushpop4, 0, "stmdb sp!, {r0,lr}"}, + {&benchmark_pushpop4, 4, "ldmia sp!, {r0,pc}"}, +#ifdef CONFIG_THUMB2_KERNEL + {&benchmark_pushpop_thumb, 0, "push.n {r0-r7,lr}"}, + {&benchmark_pushpop_thumb, 2, "pop.n {r0-r7,pc}"}, +#endif + {0} + }; + + struct benchmarks *b; + for (b = list; b->fn; ++b) { + ret = kprobe_benchmark(b->fn, b->offset); + if (ret < 0) + return ret; + pr_info(" %dns for kprobe %s\n", ret, b->title); + } + + pr_info("\n"); + return 0; +} + +#endif /* BENCHMARKING */ + + +/* + * Decoding table self-consistency tests + */ + +static const int decode_struct_sizes[NUM_DECODE_TYPES] = { + [DECODE_TYPE_TABLE] = sizeof(struct decode_table), + [DECODE_TYPE_CUSTOM] = sizeof(struct decode_custom), + [DECODE_TYPE_SIMULATE] = sizeof(struct decode_simulate), + [DECODE_TYPE_EMULATE] = sizeof(struct decode_emulate), + [DECODE_TYPE_OR] = sizeof(struct decode_or), + [DECODE_TYPE_REJECT] = sizeof(struct decode_reject) +}; + +static int table_iter(const union decode_item *table, + int (*fn)(const struct decode_header *, void *), + void *args) +{ + const struct decode_header *h = (struct decode_header *)table; + int result; + + for (;;) { + enum decode_type type = h->type_regs.bits & DECODE_TYPE_MASK; + + if (type == DECODE_TYPE_END) + return 0; + + result = fn(h, args); + if (result) + return result; + + h = (struct decode_header *) + ((uintptr_t)h + decode_struct_sizes[type]); + + } +} + +static int table_test_fail(const struct decode_header *h, const char* message) +{ + + pr_err("FAIL: kprobes test failure \"%s\" (mask %08x, value %08x)\n", + message, h->mask.bits, h->value.bits); + return -EINVAL; +} + +struct table_test_args { + const union decode_item *root_table; + u32 parent_mask; + u32 parent_value; +}; + +static int table_test_fn(const struct decode_header *h, void *args) +{ + struct table_test_args *a = (struct table_test_args *)args; + enum decode_type type = h->type_regs.bits & DECODE_TYPE_MASK; + + if (h->value.bits & ~h->mask.bits) + return table_test_fail(h, "Match value has bits not in mask"); + + if ((h->mask.bits & a->parent_mask) != a->parent_mask) + return table_test_fail(h, "Mask has bits not in parent mask"); + + if ((h->value.bits ^ a->parent_value) & a->parent_mask) + return table_test_fail(h, "Value is inconsistent with parent"); + + if (type == DECODE_TYPE_TABLE) { + struct decode_table *d = (struct decode_table *)h; + struct table_test_args args2 = *a; + args2.parent_mask = h->mask.bits; + args2.parent_value = h->value.bits; + return table_iter(d->table.table, table_test_fn, &args2); + } + + return 0; +} + +static int table_test(const union decode_item *table) +{ + struct table_test_args args = { + .root_table = table, + .parent_mask = 0, + .parent_value = 0 + }; + return table_iter(args.root_table, table_test_fn, &args); +} + + +/* + * Decoding table test coverage analysis + * + * coverage_start() builds a coverage_table which contains a list of + * coverage_entry's to match each entry in the specified kprobes instruction + * decoding table. + * + * When test cases are run, coverage_add() is called to process each case. 
+ * This looks up the corresponding entry in the coverage_table and sets it as + * being matched, as well as clearing the regs flag appropriate for the test. + * + * After all test cases have been run, coverage_end() is called to check that + * all entries in coverage_table have been matched and that all regs flags are + * cleared. I.e. that all possible combinations of instructions described by + * the kprobes decoding tables have had a test case executed for them. + */ + +bool coverage_fail; + +#define MAX_COVERAGE_ENTRIES 256 + +struct coverage_entry { + const struct decode_header *header; + unsigned regs; + unsigned nesting; + char matched; +}; + +struct coverage_table { + struct coverage_entry *base; + unsigned num_entries; + unsigned nesting; +}; + +struct coverage_table coverage; + +#define COVERAGE_ANY_REG (1<<0) +#define COVERAGE_SP (1<<1) +#define COVERAGE_PC (1<<2) +#define COVERAGE_PCWB (1<<3) + +static const char coverage_register_lookup[16] = { + [REG_TYPE_ANY] = COVERAGE_ANY_REG | COVERAGE_SP | COVERAGE_PC, + [REG_TYPE_SAMEAS16] = COVERAGE_ANY_REG, + [REG_TYPE_SP] = COVERAGE_SP, + [REG_TYPE_PC] = COVERAGE_PC, + [REG_TYPE_NOSP] = COVERAGE_ANY_REG | COVERAGE_SP, + [REG_TYPE_NOSPPC] = COVERAGE_ANY_REG | COVERAGE_SP | COVERAGE_PC, + [REG_TYPE_NOPC] = COVERAGE_ANY_REG | COVERAGE_PC, + [REG_TYPE_NOPCWB] = COVERAGE_ANY_REG | COVERAGE_PC | COVERAGE_PCWB, + [REG_TYPE_NOPCX] = COVERAGE_ANY_REG, + [REG_TYPE_NOSPPCX] = COVERAGE_ANY_REG | COVERAGE_SP, +}; + +unsigned coverage_start_registers(const struct decode_header *h) +{ + unsigned regs = 0; + int i; + for (i = 0; i < 20; i += 4) { + int r = (h->type_regs.bits >> (DECODE_TYPE_BITS + i)) & 0xf; + regs |= coverage_register_lookup[r] << i; + } + return regs; +} + +static int coverage_start_fn(const struct decode_header *h, void *args) +{ + struct coverage_table *coverage = (struct coverage_table *)args; + enum decode_type type = h->type_regs.bits & DECODE_TYPE_MASK; + struct coverage_entry *entry = coverage->base + coverage->num_entries; + + if (coverage->num_entries == MAX_COVERAGE_ENTRIES - 1) { + pr_err("FAIL: Out of space for test coverage data"); + return -ENOMEM; + } + + ++coverage->num_entries; + + entry->header = h; + entry->regs = coverage_start_registers(h); + entry->nesting = coverage->nesting; + entry->matched = false; + + if (type == DECODE_TYPE_TABLE) { + struct decode_table *d = (struct decode_table *)h; + int ret; + ++coverage->nesting; + ret = table_iter(d->table.table, coverage_start_fn, coverage); + --coverage->nesting; + return ret; + } + + return 0; +} + +static int coverage_start(const union decode_item *table) +{ + coverage.base = kmalloc(MAX_COVERAGE_ENTRIES * + sizeof(struct coverage_entry), GFP_KERNEL); + coverage.num_entries = 0; + coverage.nesting = 0; + return table_iter(table, coverage_start_fn, &coverage); +} + +static void +coverage_add_registers(struct coverage_entry *entry, kprobe_opcode_t insn) +{ + int regs = entry->header->type_regs.bits >> DECODE_TYPE_BITS; + int i; + for (i = 0; i < 20; i += 4) { + enum decode_reg_type reg_type = (regs >> i) & 0xf; + int reg = (insn >> i) & 0xf; + int flag; + + if (!reg_type) + continue; + + if (reg == 13) + flag = COVERAGE_SP; + else if (reg == 15) + flag = COVERAGE_PC; + else + flag = COVERAGE_ANY_REG; + entry->regs &= ~(flag << i); + + switch (reg_type) { + + case REG_TYPE_NONE: + case REG_TYPE_ANY: + case REG_TYPE_SAMEAS16: + break; + + case REG_TYPE_SP: + if (reg != 13) + return; + break; + + case REG_TYPE_PC: + if (reg != 15) + return; + break; + + case 
REG_TYPE_NOSP: + if (reg == 13) + return; + break; + + case REG_TYPE_NOSPPC: + case REG_TYPE_NOSPPCX: + if (reg == 13 || reg == 15) + return; + break; + + case REG_TYPE_NOPCWB: + if (!is_writeback(insn)) + break; + if (reg == 15) { + entry->regs &= ~(COVERAGE_PCWB << i); + return; + } + break; + + case REG_TYPE_NOPC: + case REG_TYPE_NOPCX: + if (reg == 15) + return; + break; + } + + } +} + +static void coverage_add(kprobe_opcode_t insn) +{ + struct coverage_entry *entry = coverage.base; + struct coverage_entry *end = coverage.base + coverage.num_entries; + bool matched = false; + unsigned nesting = 0; + + for (; entry < end; ++entry) { + const struct decode_header *h = entry->header; + enum decode_type type = h->type_regs.bits & DECODE_TYPE_MASK; + + if (entry->nesting > nesting) + continue; /* Skip sub-table we didn't match */ + + if (entry->nesting < nesting) + break; /* End of sub-table we were scanning */ + + if (!matched) { + if ((insn & h->mask.bits) != h->value.bits) + continue; + entry->matched = true; + } + + switch (type) { + + case DECODE_TYPE_TABLE: + ++nesting; + break; + + case DECODE_TYPE_CUSTOM: + case DECODE_TYPE_SIMULATE: + case DECODE_TYPE_EMULATE: + coverage_add_registers(entry, insn); + return; + + case DECODE_TYPE_OR: + matched = true; + break; + + case DECODE_TYPE_REJECT: + default: + return; + } + + } +} + +static void coverage_end(void) +{ + struct coverage_entry *entry = coverage.base; + struct coverage_entry *end = coverage.base + coverage.num_entries; + + for (; entry < end; ++entry) { + u32 mask = entry->header->mask.bits; + u32 value = entry->header->value.bits; + + if (entry->regs) { + pr_err("FAIL: Register test coverage missing for %08x %08x (%05x)\n", + mask, value, entry->regs); + coverage_fail = true; + } + if (!entry->matched) { + pr_err("FAIL: Test coverage entry missing for %08x %08x\n", + mask, value); + coverage_fail = true; + } + } + + kfree(coverage.base); +} + + +/* + * Framework for instruction set test cases + */ + +void __naked __kprobes_test_case_start(void) +{ + __asm__ __volatile__ ( + "stmdb sp!, {r4-r11} \n\t" + "sub sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t" + "bic r0, lr, #1 @ r0 = inline title string \n\t" + "mov r1, sp \n\t" + "bl kprobes_test_case_start \n\t" + "bx r0 \n\t" + ); +} + +#ifndef CONFIG_THUMB2_KERNEL + +void __naked __kprobes_test_case_end_32(void) +{ + __asm__ __volatile__ ( + "mov r4, lr \n\t" + "bl kprobes_test_case_end \n\t" + "cmp r0, #0 \n\t" + "movne pc, r0 \n\t" + "mov r0, r4 \n\t" + "add sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t" + "ldmia sp!, {r4-r11} \n\t" + "mov pc, r0 \n\t" + ); +} + +#else /* CONFIG_THUMB2_KERNEL */ + +void __naked __kprobes_test_case_end_16(void) +{ + __asm__ __volatile__ ( + "mov r4, lr \n\t" + "bl kprobes_test_case_end \n\t" + "cmp r0, #0 \n\t" + "bxne r0 \n\t" + "mov r0, r4 \n\t" + "add sp, sp, #"__stringify(TEST_MEMORY_SIZE)"\n\t" + "ldmia sp!, {r4-r11} \n\t" + "bx r0 \n\t" + ); +} + +void __naked __kprobes_test_case_end_32(void) +{ + __asm__ __volatile__ ( + ".arm \n\t" + "orr lr, lr, #1 @ will return to Thumb code \n\t" + "ldr pc, 1f \n\t" + "1: \n\t" + ".word __kprobes_test_case_end_16 \n\t" + ); +} + +#endif + + +int kprobe_test_flags; +int kprobe_test_cc_position; + +static int test_try_count; +static int test_pass_count; +static int test_fail_count; + +static struct pt_regs initial_regs; +static struct pt_regs expected_regs; +static struct pt_regs result_regs; + +static u32 expected_memory[TEST_MEMORY_SIZE/sizeof(u32)]; + +static const char *current_title; +static struct 
test_arg *current_args;
+static u32 *current_stack;
+static uintptr_t current_branch_target;
+
+static uintptr_t current_code_start;
+static kprobe_opcode_t current_instruction;
+
+
+#define TEST_CASE_PASSED -1
+#define TEST_CASE_FAILED -2
+
+static int test_case_run_count;
+static bool test_case_is_thumb;
+static int test_instance;
+
+/*
+ * We ignore the state of the imprecise abort disable flag (CPSR.A) because this
+ * can change randomly as the kernel doesn't take care to preserve or initialise
+ * this across context switches. Also, with Security Extensions, the flag may
+ * not be under control of the kernel; for this reason we ignore the state of
+ * the FIQ disable flag CPSR.F as well.
+ */
+#define PSR_IGNORE_BITS (PSR_A_BIT | PSR_F_BIT)
+
+static unsigned long test_check_cc(int cc, unsigned long cpsr)
+{
+	unsigned long temp;
+
+	switch (cc) {
+	case 0x0: /* eq */
+		return cpsr & PSR_Z_BIT;
+
+	case 0x1: /* ne */
+		return (~cpsr) & PSR_Z_BIT;
+
+	case 0x2: /* cs */
+		return cpsr & PSR_C_BIT;
+
+	case 0x3: /* cc */
+		return (~cpsr) & PSR_C_BIT;
+
+	case 0x4: /* mi */
+		return cpsr & PSR_N_BIT;
+
+	case 0x5: /* pl */
+		return (~cpsr) & PSR_N_BIT;
+
+	case 0x6: /* vs */
+		return cpsr & PSR_V_BIT;
+
+	case 0x7: /* vc */
+		return (~cpsr) & PSR_V_BIT;
+
+	case 0x8: /* hi */
+		cpsr &= ~(cpsr >> 1); /* PSR_C_BIT &= ~PSR_Z_BIT */
+		return cpsr & PSR_C_BIT;
+
+	case 0x9: /* ls */
+		cpsr &= ~(cpsr >> 1); /* PSR_C_BIT &= ~PSR_Z_BIT */
+		return (~cpsr) & PSR_C_BIT;
+
+	case 0xa: /* ge */
+		cpsr ^= (cpsr << 3); /* PSR_N_BIT ^= PSR_V_BIT */
+		return (~cpsr) & PSR_N_BIT;
+
+	case 0xb: /* lt */
+		cpsr ^= (cpsr << 3); /* PSR_N_BIT ^= PSR_V_BIT */
+		return cpsr & PSR_N_BIT;
+
+	case 0xc: /* gt */
+		temp = cpsr ^ (cpsr << 3); /* PSR_N_BIT ^= PSR_V_BIT */
+		temp |= (cpsr << 1); /* PSR_N_BIT |= PSR_Z_BIT */
+		return (~temp) & PSR_N_BIT;
+
+	case 0xd: /* le */
+		temp = cpsr ^ (cpsr << 3); /* PSR_N_BIT ^= PSR_V_BIT */
+		temp |= (cpsr << 1); /* PSR_N_BIT |= PSR_Z_BIT */
+		return temp & PSR_N_BIT;
+
+	case 0xe: /* al */
+	case 0xf: /* unconditional */
+		return true;
+	}
+	BUG();
+	return false;
+}
+
+static int is_last_scenario;
+static int probe_should_run; /* 0 = no, 1 = yes, -1 = unknown */
+static int memory_needs_checking;
+
+static unsigned long test_context_cpsr(int scenario)
+{
+	unsigned long cpsr;
+
+	probe_should_run = 1;
+
+	/* Default case is that we cycle through 16 combinations of flags */
+	cpsr = (scenario & 0xf) << 28; /* N,Z,C,V flags */
+	cpsr |= (scenario & 0xf) << 16; /* GE flags */
+	cpsr |= (scenario & 0x1) << 27; /* Toggle Q flag */
+
+	if (!test_case_is_thumb) {
+		/* Testing ARM code */
+		probe_should_run = test_check_cc(current_instruction >> 28, cpsr) != 0;
+		if (scenario == 15)
+			is_last_scenario = true;
+
+	} else if (kprobe_test_flags & TEST_FLAG_NO_ITBLOCK) {
+		/* Testing Thumb code without setting ITSTATE */
+		if (kprobe_test_cc_position) {
+			int cc = (current_instruction >> kprobe_test_cc_position) & 0xf;
+			probe_should_run = test_check_cc(cc, cpsr) != 0;
+		}
+
+		if (scenario == 15)
+			is_last_scenario = true;
+
+	} else if (kprobe_test_flags & TEST_FLAG_FULL_ITBLOCK) {
+		/* Testing Thumb code with all combinations of ITSTATE */
+		unsigned x = (scenario >> 4);
+		unsigned cond_base = x % 7; /* ITSTATE<7:5> */
+		unsigned mask = x / 7 + 2; /* ITSTATE<4:0>, bits reversed */
+
+		if (mask > 0x1f) {
+			/* Finish by testing state from instruction 'itt al' */
+			cond_base = 7;
+			mask = 0x4;
+			if ((scenario & 0xf) == 0xf)
+				is_last_scenario = true;
+		}
+
+		cpsr |= cond_base << 13; /*
ITSTATE<7:5> */ + cpsr |= (mask & 0x1) << 12; /* ITSTATE<4> */ + cpsr |= (mask & 0x2) << 10; /* ITSTATE<3> */ + cpsr |= (mask & 0x4) << 8; /* ITSTATE<2> */ + cpsr |= (mask & 0x8) << 23; /* ITSTATE<1> */ + cpsr |= (mask & 0x10) << 21; /* ITSTATE<0> */ + + probe_should_run = test_check_cc((cpsr >> 12) & 0xf, cpsr) != 0; + + } else { + /* Testing Thumb code with several combinations of ITSTATE */ + switch (scenario) { + case 16: /* Clear NZCV flags and 'it eq' state (false as Z=0) */ + cpsr = 0x00000800; + probe_should_run = 0; + break; + case 17: /* Set NZCV flags and 'it vc' state (false as V=1) */ + cpsr = 0xf0007800; + probe_should_run = 0; + break; + case 18: /* Clear NZCV flags and 'it ls' state (true as C=0) */ + cpsr = 0x00009800; + break; + case 19: /* Set NZCV flags and 'it cs' state (true as C=1) */ + cpsr = 0xf0002800; + is_last_scenario = true; + break; + } + } + + return cpsr; +} + +static void setup_test_context(struct pt_regs *regs) +{ + int scenario = test_case_run_count>>1; + unsigned long val; + struct test_arg *args; + int i; + + is_last_scenario = false; + memory_needs_checking = false; + + /* Initialise test memory on stack */ + val = (scenario & 1) ? VALM : ~VALM; + for (i = 0; i < TEST_MEMORY_SIZE / sizeof(current_stack[0]); ++i) + current_stack[i] = val + (i << 8); + /* Put target of branch on stack for tests which load PC from memory */ + if (current_branch_target) + current_stack[15] = current_branch_target; + /* Put a value for SP on stack for tests which load SP from memory */ + current_stack[13] = (u32)current_stack + 120; + + /* Initialise register values to their default state */ + val = (scenario & 2) ? VALR : ~VALR; + for (i = 0; i < 13; ++i) + regs->uregs[i] = val ^ (i << 8); + regs->ARM_lr = val ^ (14 << 8); + regs->ARM_cpsr &= ~(APSR_MASK | PSR_IT_MASK); + regs->ARM_cpsr |= test_context_cpsr(scenario); + + /* Perform testcase specific register setup */ + args = current_args; + for (; args[0].type != ARG_TYPE_END; ++args) + switch (args[0].type) { + case ARG_TYPE_REG: { + struct test_arg_regptr *arg = + (struct test_arg_regptr *)args; + regs->uregs[arg->reg] = arg->val; + break; + } + case ARG_TYPE_PTR: { + struct test_arg_regptr *arg = + (struct test_arg_regptr *)args; + regs->uregs[arg->reg] = + (unsigned long)current_stack + arg->val; + memory_needs_checking = true; + break; + } + case ARG_TYPE_MEM: { + struct test_arg_mem *arg = (struct test_arg_mem *)args; + current_stack[arg->index] = arg->val; + break; + } + default: + break; + } +} + +struct test_probe { + struct kprobe kprobe; + bool registered; + int hit; +}; + +static void unregister_test_probe(struct test_probe *probe) +{ + if (probe->registered) { + unregister_kprobe(&probe->kprobe); + probe->kprobe.flags = 0; /* Clear disable flag to allow reuse */ + } + probe->registered = false; +} + +static int register_test_probe(struct test_probe *probe) +{ + int ret; + + if (probe->registered) + BUG(); + + ret = register_kprobe(&probe->kprobe); + if (ret >= 0) { + probe->registered = true; + probe->hit = -1; + } + return ret; +} + +static int __kprobes +test_before_pre_handler(struct kprobe *p, struct pt_regs *regs) +{ + container_of(p, struct test_probe, kprobe)->hit = test_instance; + return 0; +} + +static void __kprobes +test_before_post_handler(struct kprobe *p, struct pt_regs *regs, + unsigned long flags) +{ + setup_test_context(regs); + initial_regs = *regs; + initial_regs.ARM_cpsr &= ~PSR_IGNORE_BITS; +} + +static int __kprobes +test_case_pre_handler(struct kprobe *p, struct pt_regs *regs) +{ + 
container_of(p, struct test_probe, kprobe)->hit = test_instance; + return 0; +} + +static int __kprobes +test_after_pre_handler(struct kprobe *p, struct pt_regs *regs) +{ + if (container_of(p, struct test_probe, kprobe)->hit == test_instance) + return 0; /* Already run for this test instance */ + + result_regs = *regs; + result_regs.ARM_cpsr &= ~PSR_IGNORE_BITS; + + /* Undo any changes done to SP by the test case */ + regs->ARM_sp = (unsigned long)current_stack; + + container_of(p, struct test_probe, kprobe)->hit = test_instance; + return 0; +} + +static struct test_probe test_before_probe = { + .kprobe.pre_handler = test_before_pre_handler, + .kprobe.post_handler = test_before_post_handler, +}; + +static struct test_probe test_case_probe = { + .kprobe.pre_handler = test_case_pre_handler, +}; + +static struct test_probe test_after_probe = { + .kprobe.pre_handler = test_after_pre_handler, +}; + +static struct test_probe test_after2_probe = { + .kprobe.pre_handler = test_after_pre_handler, +}; + +static void test_case_cleanup(void) +{ + unregister_test_probe(&test_before_probe); + unregister_test_probe(&test_case_probe); + unregister_test_probe(&test_after_probe); + unregister_test_probe(&test_after2_probe); +} + +static void print_registers(struct pt_regs *regs) +{ + pr_err("r0 %08lx | r1 %08lx | r2 %08lx | r3 %08lx\n", + regs->ARM_r0, regs->ARM_r1, regs->ARM_r2, regs->ARM_r3); + pr_err("r4 %08lx | r5 %08lx | r6 %08lx | r7 %08lx\n", + regs->ARM_r4, regs->ARM_r5, regs->ARM_r6, regs->ARM_r7); + pr_err("r8 %08lx | r9 %08lx | r10 %08lx | r11 %08lx\n", + regs->ARM_r8, regs->ARM_r9, regs->ARM_r10, regs->ARM_fp); + pr_err("r12 %08lx | sp %08lx | lr %08lx | pc %08lx\n", + regs->ARM_ip, regs->ARM_sp, regs->ARM_lr, regs->ARM_pc); + pr_err("cpsr %08lx\n", regs->ARM_cpsr); +} + +static void print_memory(u32 *mem, size_t size) +{ + int i; + for (i = 0; i < size / sizeof(u32); i += 4) + pr_err("%08x %08x %08x %08x\n", mem[i], mem[i+1], + mem[i+2], mem[i+3]); +} + +static size_t expected_memory_size(u32 *sp) +{ + size_t size = sizeof(expected_memory); + int offset = (uintptr_t)sp - (uintptr_t)current_stack; + if (offset > 0) + size -= offset; + return size; +} + +static void test_case_failed(const char *message) +{ + test_case_cleanup(); + + pr_err("FAIL: %s\n", message); + pr_err("FAIL: Test %s\n", current_title); + pr_err("FAIL: Scenario %d\n", test_case_run_count >> 1); +} + +static unsigned long next_instruction(unsigned long pc) +{ +#ifdef CONFIG_THUMB2_KERNEL + if ((pc & 1) && !is_wide_instruction(*(u16 *)(pc - 1))) + return pc + 2; + else +#endif + return pc + 4; +} + +static uintptr_t __used kprobes_test_case_start(const char *title, void *stack) +{ + struct test_arg *args; + struct test_arg_end *end_arg; + unsigned long test_code; + + args = (struct test_arg *)PTR_ALIGN(title + strlen(title) + 1, 4); + + current_title = title; + current_args = args; + current_stack = stack; + + ++test_try_count; + + while (args->type != ARG_TYPE_END) + ++args; + end_arg = (struct test_arg_end *)args; + + test_code = (unsigned long)(args + 1); /* Code starts after args */ + + test_case_is_thumb = end_arg->flags & ARG_FLAG_THUMB; + if (test_case_is_thumb) + test_code |= 1; + + current_code_start = test_code; + + current_branch_target = 0; + if (end_arg->branch_offset != end_arg->end_offset) + current_branch_target = test_code + end_arg->branch_offset; + + test_code += end_arg->code_offset; + test_before_probe.kprobe.addr = (kprobe_opcode_t *)test_code; + + test_code = next_instruction(test_code); + 
test_case_probe.kprobe.addr = (kprobe_opcode_t *)test_code; + + if (test_case_is_thumb) { + u16 *p = (u16 *)(test_code & ~1); + current_instruction = p[0]; + if (is_wide_instruction(current_instruction)) { + current_instruction <<= 16; + current_instruction |= p[1]; + } + } else { + current_instruction = *(u32 *)test_code; + } + + if (current_title[0] == '.') + verbose("%s\n", current_title); + else + verbose("%s\t@ %0*x\n", current_title, + test_case_is_thumb ? 4 : 8, + current_instruction); + + test_code = next_instruction(test_code); + test_after_probe.kprobe.addr = (kprobe_opcode_t *)test_code; + + if (kprobe_test_flags & TEST_FLAG_NARROW_INSTR) { + if (!test_case_is_thumb || + is_wide_instruction(current_instruction)) { + test_case_failed("expected 16-bit instruction"); + goto fail; + } + } else { + if (test_case_is_thumb && + !is_wide_instruction(current_instruction)) { + test_case_failed("expected 32-bit instruction"); + goto fail; + } + } + + coverage_add(current_instruction); + + if (end_arg->flags & ARG_FLAG_UNSUPPORTED) { + if (register_test_probe(&test_case_probe) < 0) + goto pass; + test_case_failed("registered probe for unsupported instruction"); + goto fail; + } + + if (end_arg->flags & ARG_FLAG_SUPPORTED) { + if (register_test_probe(&test_case_probe) >= 0) + goto pass; + test_case_failed("couldn't register probe for supported instruction"); + goto fail; + } + + if (register_test_probe(&test_before_probe) < 0) { + test_case_failed("register test_before_probe failed"); + goto fail; + } + if (register_test_probe(&test_after_probe) < 0) { + test_case_failed("register test_after_probe failed"); + goto fail; + } + if (current_branch_target) { + test_after2_probe.kprobe.addr = + (kprobe_opcode_t *)current_branch_target; + if (register_test_probe(&test_after2_probe) < 0) { + test_case_failed("register test_after2_probe failed"); + goto fail; + } + } + + /* Start first run of test case */ + test_case_run_count = 0; + ++test_instance; + return current_code_start; +pass: + test_case_run_count = TEST_CASE_PASSED; + return (uintptr_t)test_after_probe.kprobe.addr; +fail: + test_case_run_count = TEST_CASE_FAILED; + return (uintptr_t)test_after_probe.kprobe.addr; +} + +static bool check_test_results(void) +{ + size_t mem_size = 0; + u32 *mem = 0; + + if (memcmp(&expected_regs, &result_regs, sizeof(expected_regs))) { + test_case_failed("registers differ"); + goto fail; + } + + if (memory_needs_checking) { + mem = (u32 *)result_regs.ARM_sp; + mem_size = expected_memory_size(mem); + if (memcmp(expected_memory, mem, mem_size)) { + test_case_failed("test memory differs"); + goto fail; + } + } + + return true; + +fail: + pr_err("initial_regs:\n"); + print_registers(&initial_regs); + pr_err("expected_regs:\n"); + print_registers(&expected_regs); + pr_err("result_regs:\n"); + print_registers(&result_regs); + + if (mem) { + pr_err("current_stack=%p\n", current_stack); + pr_err("expected_memory:\n"); + print_memory(expected_memory, mem_size); + pr_err("result_memory:\n"); + print_memory(mem, mem_size); + } + + return false; +} + +static uintptr_t __used kprobes_test_case_end(void) +{ + if (test_case_run_count < 0) { + if (test_case_run_count == TEST_CASE_PASSED) + /* kprobes_test_case_start did all the needed testing */ + goto pass; + else + /* kprobes_test_case_start failed */ + goto fail; + } + + if (test_before_probe.hit != test_instance) { + test_case_failed("test_before_handler not run"); + goto fail; + } + + if (test_after_probe.hit != test_instance && + test_after2_probe.hit != test_instance) 
{
+		test_case_failed("test_after_handler not run");
+		goto fail;
+	}
+
+	/*
+	 * Even numbered test runs ran without a probe on the test case so
+	 * we can gather reference results. The subsequent odd numbered run
+	 * will have the probe inserted.
+	 */
+	if ((test_case_run_count & 1) == 0) {
+		/* Save results from run without probe */
+		u32 *mem = (u32 *)result_regs.ARM_sp;
+		expected_regs = result_regs;
+		memcpy(expected_memory, mem, expected_memory_size(mem));
+
+		/* Insert probe onto test case instruction */
+		if (register_test_probe(&test_case_probe) < 0) {
+			test_case_failed("register test_case_probe failed");
+			goto fail;
+		}
+	} else {
+		/* Check probe ran as expected */
+		if (probe_should_run == 1) {
+			if (test_case_probe.hit != test_instance) {
+				test_case_failed("test_case_handler not run");
+				goto fail;
+			}
+		} else if (probe_should_run == 0) {
+			if (test_case_probe.hit == test_instance) {
+				test_case_failed("test_case_handler ran");
+				goto fail;
+			}
+		}
+
+		/* Remove probe for any subsequent reference run */
+		unregister_test_probe(&test_case_probe);
+
+		if (!check_test_results())
+			goto fail;
+
+		if (is_last_scenario)
+			goto pass;
+	}
+
+	/* Do next test run */
+	++test_case_run_count;
+	++test_instance;
+	return current_code_start;
+fail:
+	++test_fail_count;
+	goto end;
+pass:
+	++test_pass_count;
+end:
+	test_case_cleanup();
+	return 0;
+}
+
+
+/*
+ * Top level test functions
+ */
+
+static int run_test_cases(void (*tests)(void), const union decode_item *table)
+{
+	int ret;
+
+	pr_info(" Check decoding tables\n");
+	ret = table_test(table);
+	if (ret)
+		return ret;
+
+	pr_info(" Run test cases\n");
+	ret = coverage_start(table);
+	if (ret)
+		return ret;
+
+	tests();
+
+	coverage_end();
+	return 0;
+}
+
+
+static int __init run_all_tests(void)
+{
+	int ret = 0;
+
+	pr_info("Beginning kprobe tests...\n");
+
+#ifndef CONFIG_THUMB2_KERNEL
+
+	pr_info("Probe ARM code\n");
+	ret = run_api_tests(arm_func);
+	if (ret)
+		goto out;
+
+	pr_info("ARM instruction simulation\n");
+	ret = run_test_cases(kprobe_arm_test_cases, kprobe_decode_arm_table);
+	if (ret)
+		goto out;
+
+#else /* CONFIG_THUMB2_KERNEL */
+
+	pr_info("Probe 16-bit Thumb code\n");
+	ret = run_api_tests(thumb16_func);
+	if (ret)
+		goto out;
+
+	pr_info("Probe 32-bit Thumb code, even halfword\n");
+	ret = run_api_tests(thumb32even_func);
+	if (ret)
+		goto out;
+
+	pr_info("Probe 32-bit Thumb code, odd halfword\n");
+	ret = run_api_tests(thumb32odd_func);
+	if (ret)
+		goto out;
+
+	pr_info("16-bit Thumb instruction simulation\n");
+	ret = run_test_cases(kprobe_thumb16_test_cases,
+				kprobe_decode_thumb16_table);
+	if (ret)
+		goto out;
+
+	pr_info("32-bit Thumb instruction simulation\n");
+	ret = run_test_cases(kprobe_thumb32_test_cases,
+				kprobe_decode_thumb32_table);
+	if (ret)
+		goto out;
+#endif
+
+	pr_info("Total instruction simulation tests=%d, pass=%d fail=%d\n",
+		test_try_count, test_pass_count, test_fail_count);
+	if (test_fail_count) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+#if BENCHMARKING
+	pr_info("Benchmarks\n");
+	ret = run_benchmarks();
+	if (ret)
+		goto out;
+#endif
+
+#if __LINUX_ARM_ARCH__ >= 7
+	/* We are able to run all test cases so coverage should be complete */
+	if (coverage_fail) {
+		pr_err("FAIL: Test coverage checks failed\n");
+		ret = -EINVAL;
+		goto out;
+	}
+#endif
+
+out:
+	if (ret == 0)
+		pr_info("Finished kprobe tests OK\n");
+	else
+		pr_err("kprobe tests failed\n");
+
+	return ret;
+}
+
+
+/*
+ * Module setup
+ */
+
+#ifdef MODULE
+
+static void __exit kprobe_test_exit(void)
+{
+}
+
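The even/odd run pairing above is the core of the test framework: each scenario is first executed without a probe on the test instruction to capture reference register and memory state, then executed again with a kprobe inserted, and the two resulting states must be identical. Below is a minimal, self-contained sketch of that comparison pattern, not part of the patch; struct cpu_state, run_scenario() and run_test_case() are hypothetical stand-ins for the real machinery, which snapshots pt_regs and the stack-based test memory via the before/after probes and additionally checks that the probe handler fired exactly when probe_should_run says it should.

/*
 * Hypothetical illustration of the reference-vs-probed comparison
 * performed by kprobes_test_case_end(): the even-numbered run records
 * reference state, the odd-numbered run repeats the scenario with a
 * probe inserted, and any divergence is a test failure.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

struct cpu_state {
	unsigned long regs[16];
	unsigned long cpsr;
};

/* Stand-in for one execution of the test code, probed or unprobed. */
static void run_scenario(int scenario, bool probed, struct cpu_state *out)
{
	int i;

	/*
	 * A correct probe implementation is transparent, so the probed
	 * run must reproduce the state of the unprobed run exactly.
	 */
	for (i = 0; i < 16; ++i)
		out->regs[i] = (unsigned long)(scenario ^ (i << 8));
	out->cpsr = (unsigned long)(scenario & 0xf) << 28;
	(void)probed;
}

static int run_test_case(int num_scenarios)
{
	struct cpu_state expected, result;
	int scenario;

	for (scenario = 0; scenario < num_scenarios; ++scenario) {
		run_scenario(scenario, false, &expected); /* even run: reference */
		run_scenario(scenario, true, &result);    /* odd run: probed */
		if (memcmp(&expected, &result, sizeof(expected))) {
			fprintf(stderr, "FAIL: scenario %d differs\n",
				scenario);
			return -1;
		}
	}
	return 0;
}

int main(void)
{
	return run_test_case(16) ? 1 : 0;
}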
+module_init(run_all_tests) +module_exit(kprobe_test_exit) +MODULE_LICENSE("GPL"); + +#else /* !MODULE */ + +late_initcall(run_all_tests); + +#endif diff --git a/arch/arm/kernel/kprobes-test.h b/arch/arm/kernel/kprobes-test.h new file mode 100644 index 000000000000..0dc5d77b9356 --- /dev/null +++ b/arch/arm/kernel/kprobes-test.h @@ -0,0 +1,392 @@ +/* + * arch/arm/kernel/kprobes-test.h + * + * Copyright (C) 2011 Jon Medhurst . + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#define VERBOSE 0 /* Set to '1' for more logging of test cases */ + +#ifdef CONFIG_THUMB2_KERNEL +#define NORMAL_ISA "16" +#else +#define NORMAL_ISA "32" +#endif + + +/* Flags used in kprobe_test_flags */ +#define TEST_FLAG_NO_ITBLOCK (1<<0) +#define TEST_FLAG_FULL_ITBLOCK (1<<1) +#define TEST_FLAG_NARROW_INSTR (1<<2) + +extern int kprobe_test_flags; +extern int kprobe_test_cc_position; + + +#define TEST_MEMORY_SIZE 256 + + +/* + * Test case structures. + * + * The arguments given to test cases can be one of three types. + * + * ARG_TYPE_REG + * Load a register with the given value. + * + * ARG_TYPE_PTR + * Load a register with a pointer into the stack buffer (SP + given value). + * + * ARG_TYPE_MEM + * Store the given value into the stack buffer at [SP+index]. + * + */ + +#define ARG_TYPE_END 0 +#define ARG_TYPE_REG 1 +#define ARG_TYPE_PTR 2 +#define ARG_TYPE_MEM 3 + +#define ARG_FLAG_UNSUPPORTED 0x01 +#define ARG_FLAG_SUPPORTED 0x02 +#define ARG_FLAG_THUMB 0x10 /* Must be 16 so TEST_ISA can be used */ +#define ARG_FLAG_ARM 0x20 /* Must be 32 so TEST_ISA can be used */ + +struct test_arg { + u8 type; /* ARG_TYPE_x */ + u8 _padding[7]; +}; + +struct test_arg_regptr { + u8 type; /* ARG_TYPE_REG or ARG_TYPE_PTR */ + u8 reg; + u8 _padding[2]; + u32 val; +}; + +struct test_arg_mem { + u8 type; /* ARG_TYPE_MEM */ + u8 index; + u8 _padding[2]; + u32 val; +}; + +struct test_arg_end { + u8 type; /* ARG_TYPE_END */ + u8 flags; /* ARG_FLAG_x */ + u16 code_offset; + u16 branch_offset; + u16 end_offset; +}; + + +/* + * Building blocks for test cases. + * + * Each test case is wrapped between TESTCASE_START and TESTCASE_END. + * + * To specify arguments for a test case the TEST_ARG_{REG,PTR,MEM} macros are + * used followed by a terminating TEST_ARG_END. + * + * After this, the instruction to be tested is defined with TEST_INSTRUCTION. + * Or for branches, TEST_BRANCH_B and TEST_BRANCH_F (branch forwards/backwards). + * + * Some specific test cases may make use of other custom constructs. + */ + +#if VERBOSE +#define verbose(fmt, ...) pr_info(fmt, ##__VA_ARGS__) +#else +#define verbose(fmt, ...) +#endif + +#define TEST_GROUP(title) \ + verbose("\n"); \ + verbose(title"\n"); \ + verbose("---------------------------------------------------------\n"); + +#define TESTCASE_START(title) \ + __asm__ __volatile__ ( \ + "bl __kprobes_test_case_start \n\t" \ + /* don't use .asciz here as 'title' may be */ \ + /* multiple strings to be concatenated. 
*/ \
+	".ascii "#title" \n\t" \
+	".byte 0 \n\t" \
+	".align 2 \n\t"
+
+#define TEST_ARG_REG(reg, val) \
+	".byte "__stringify(ARG_TYPE_REG)" \n\t" \
+	".byte "#reg" \n\t" \
+	".short 0 \n\t" \
+	".word "#val" \n\t"
+
+#define TEST_ARG_PTR(reg, val) \
+	".byte "__stringify(ARG_TYPE_PTR)" \n\t" \
+	".byte "#reg" \n\t" \
+	".short 0 \n\t" \
+	".word "#val" \n\t"
+
+#define TEST_ARG_MEM(index, val) \
+	".byte "__stringify(ARG_TYPE_MEM)" \n\t" \
+	".byte "#index" \n\t" \
+	".short 0 \n\t" \
+	".word "#val" \n\t"
+
+#define TEST_ARG_END(flags) \
+	".byte "__stringify(ARG_TYPE_END)" \n\t" \
+	".byte "TEST_ISA flags" \n\t" \
+	".short 50f-0f \n\t" \
+	".short 2f-0f \n\t" \
+	".short 99f-0f \n\t" \
+	".code "TEST_ISA" \n\t" \
+	"0: \n\t"
+
+#define TEST_INSTRUCTION(instruction) \
+	"50: nop \n\t" \
+	"1: "instruction" \n\t" \
+	" nop \n\t"
+
+#define TEST_BRANCH_F(instruction, xtra_dist) \
+	TEST_INSTRUCTION(instruction) \
+	".if "#xtra_dist" \n\t" \
+	" b 99f \n\t" \
+	".space "#xtra_dist" \n\t" \
+	".endif \n\t" \
+	" b 99f \n\t" \
+	"2: nop \n\t"
+
+#define TEST_BRANCH_B(instruction, xtra_dist) \
+	" b 50f \n\t" \
+	" b 99f \n\t" \
+	"2: nop \n\t" \
+	" b 99f \n\t" \
+	".if "#xtra_dist" \n\t" \
+	".space "#xtra_dist" \n\t" \
+	".endif \n\t" \
+	TEST_INSTRUCTION(instruction)
+
+#define TESTCASE_END \
+	"2: \n\t" \
+	"99: \n\t" \
+	" bl __kprobes_test_case_end_"TEST_ISA" \n\t" \
+	".code "NORMAL_ISA" \n\t" \
+	: : \
+	: "r0", "r1", "r2", "r3", "ip", "lr", "memory", "cc" \
+	);
+
+
+/*
+ * Macros to define test cases.
+ *
+ * Those of the form TEST_{R,P,M}* can be used to define test cases
+ * which take combinations of the three basic types of arguments. E.g.
+ *
+ * TEST_R One register argument
+ * TEST_RR Two register arguments
+ * TEST_RPR A register, a pointer, then a register argument
+ *
+ * For testing instructions which may branch, there are macros TEST_BF_*
+ * and TEST_BB_* for branching forwards and backwards.
+ *
+ * TEST_SUPPORTED and TEST_UNSUPPORTED don't cause the code to be executed;
+ * they just verify that a kprobe is or is not allowed on the given instruction.
+ */ + +#define TEST(code) \ + TESTCASE_START(code) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code) \ + TESTCASE_END + +#define TEST_UNSUPPORTED(code) \ + TESTCASE_START(code) \ + TEST_ARG_END("|"__stringify(ARG_FLAG_UNSUPPORTED)) \ + TEST_INSTRUCTION(code) \ + TESTCASE_END + +#define TEST_SUPPORTED(code) \ + TESTCASE_START(code) \ + TEST_ARG_END("|"__stringify(ARG_FLAG_SUPPORTED)) \ + TEST_INSTRUCTION(code) \ + TESTCASE_END + +#define TEST_R(code1, reg, val, code2) \ + TESTCASE_START(code1 #reg code2) \ + TEST_ARG_REG(reg, val) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 #reg code2) \ + TESTCASE_END + +#define TEST_RR(code1, reg1, val1, code2, reg2, val2, code3) \ + TESTCASE_START(code1 #reg1 code2 #reg2 code3) \ + TEST_ARG_REG(reg1, val1) \ + TEST_ARG_REG(reg2, val2) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 #reg1 code2 #reg2 code3) \ + TESTCASE_END + +#define TEST_RRR(code1, reg1, val1, code2, reg2, val2, code3, reg3, val3, code4)\ + TESTCASE_START(code1 #reg1 code2 #reg2 code3 #reg3 code4) \ + TEST_ARG_REG(reg1, val1) \ + TEST_ARG_REG(reg2, val2) \ + TEST_ARG_REG(reg3, val3) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 #reg1 code2 #reg2 code3 #reg3 code4) \ + TESTCASE_END + +#define TEST_RRRR(code1, reg1, val1, code2, reg2, val2, code3, reg3, val3, code4, reg4, val4) \ + TESTCASE_START(code1 #reg1 code2 #reg2 code3 #reg3 code4 #reg4) \ + TEST_ARG_REG(reg1, val1) \ + TEST_ARG_REG(reg2, val2) \ + TEST_ARG_REG(reg3, val3) \ + TEST_ARG_REG(reg4, val4) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 #reg1 code2 #reg2 code3 #reg3 code4 #reg4) \ + TESTCASE_END + +#define TEST_P(code1, reg1, val1, code2) \ + TESTCASE_START(code1 #reg1 code2) \ + TEST_ARG_PTR(reg1, val1) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 #reg1 code2) \ + TESTCASE_END + +#define TEST_PR(code1, reg1, val1, code2, reg2, val2, code3) \ + TESTCASE_START(code1 #reg1 code2 #reg2 code3) \ + TEST_ARG_PTR(reg1, val1) \ + TEST_ARG_REG(reg2, val2) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 #reg1 code2 #reg2 code3) \ + TESTCASE_END + +#define TEST_RP(code1, reg1, val1, code2, reg2, val2, code3) \ + TESTCASE_START(code1 #reg1 code2 #reg2 code3) \ + TEST_ARG_REG(reg1, val1) \ + TEST_ARG_PTR(reg2, val2) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 #reg1 code2 #reg2 code3) \ + TESTCASE_END + +#define TEST_PRR(code1, reg1, val1, code2, reg2, val2, code3, reg3, val3, code4)\ + TESTCASE_START(code1 #reg1 code2 #reg2 code3 #reg3 code4) \ + TEST_ARG_PTR(reg1, val1) \ + TEST_ARG_REG(reg2, val2) \ + TEST_ARG_REG(reg3, val3) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 #reg1 code2 #reg2 code3 #reg3 code4) \ + TESTCASE_END + +#define TEST_RPR(code1, reg1, val1, code2, reg2, val2, code3, reg3, val3, code4)\ + TESTCASE_START(code1 #reg1 code2 #reg2 code3 #reg3 code4) \ + TEST_ARG_REG(reg1, val1) \ + TEST_ARG_PTR(reg2, val2) \ + TEST_ARG_REG(reg3, val3) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 #reg1 code2 #reg2 code3 #reg3 code4) \ + TESTCASE_END + +#define TEST_RRP(code1, reg1, val1, code2, reg2, val2, code3, reg3, val3, code4)\ + TESTCASE_START(code1 #reg1 code2 #reg2 code3 #reg3 code4) \ + TEST_ARG_REG(reg1, val1) \ + TEST_ARG_REG(reg2, val2) \ + TEST_ARG_PTR(reg3, val3) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 #reg1 code2 #reg2 code3 #reg3 code4) \ + TESTCASE_END + +#define TEST_BF_P(code1, reg1, val1, code2) \ + TESTCASE_START(code1 #reg1 code2) \ + TEST_ARG_PTR(reg1, val1) \ + TEST_ARG_END("") \ + TEST_BRANCH_F(code1 #reg1 code2, 0) \ + TESTCASE_END + +#define TEST_BF_X(code, xtra_dist) \ + 
TESTCASE_START(code) \ + TEST_ARG_END("") \ + TEST_BRANCH_F(code, xtra_dist) \ + TESTCASE_END + +#define TEST_BB_X(code, xtra_dist) \ + TESTCASE_START(code) \ + TEST_ARG_END("") \ + TEST_BRANCH_B(code, xtra_dist) \ + TESTCASE_END + +#define TEST_BF_RX(code1, reg, val, code2, xtra_dist) \ + TESTCASE_START(code1 #reg code2) \ + TEST_ARG_REG(reg, val) \ + TEST_ARG_END("") \ + TEST_BRANCH_F(code1 #reg code2, xtra_dist) \ + TESTCASE_END + +#define TEST_BB_RX(code1, reg, val, code2, xtra_dist) \ + TESTCASE_START(code1 #reg code2) \ + TEST_ARG_REG(reg, val) \ + TEST_ARG_END("") \ + TEST_BRANCH_B(code1 #reg code2, xtra_dist) \ + TESTCASE_END + +#define TEST_BF(code) TEST_BF_X(code, 0) +#define TEST_BB(code) TEST_BB_X(code, 0) + +#define TEST_BF_R(code1, reg, val, code2) TEST_BF_RX(code1, reg, val, code2, 0) +#define TEST_BB_R(code1, reg, val, code2) TEST_BB_RX(code1, reg, val, code2, 0) + +#define TEST_BF_RR(code1, reg1, val1, code2, reg2, val2, code3) \ + TESTCASE_START(code1 #reg1 code2 #reg2 code3) \ + TEST_ARG_REG(reg1, val1) \ + TEST_ARG_REG(reg2, val2) \ + TEST_ARG_END("") \ + TEST_BRANCH_F(code1 #reg1 code2 #reg2 code3, 0) \ + TESTCASE_END + +#define TEST_X(code, codex) \ + TESTCASE_START(code) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code) \ + " b 99f \n\t" \ + " "codex" \n\t" \ + TESTCASE_END + +#define TEST_RX(code1, reg, val, code2, codex) \ + TESTCASE_START(code1 #reg code2) \ + TEST_ARG_REG(reg, val) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 __stringify(reg) code2) \ + " b 99f \n\t" \ + " "codex" \n\t" \ + TESTCASE_END + +#define TEST_RRX(code1, reg1, val1, code2, reg2, val2, code3, codex) \ + TESTCASE_START(code1 #reg1 code2 #reg2 code3) \ + TEST_ARG_REG(reg1, val1) \ + TEST_ARG_REG(reg2, val2) \ + TEST_ARG_END("") \ + TEST_INSTRUCTION(code1 __stringify(reg1) code2 __stringify(reg2) code3) \ + " b 99f \n\t" \ + " "codex" \n\t" \ + TESTCASE_END + + +/* Various values used in test cases... 
*/ +#define N(val) (val ^ 0xffffffff) +#define VAL1 0x12345678 +#define VAL2 N(VAL1) +#define VAL3 0xa5f801 +#define VAL4 N(VAL3) +#define VALM 0x456789ab +#define VALR 0xdeaddead +#define HH1 0x0123fecb +#define HH2 0xa9874567 + + +#ifdef CONFIG_THUMB2_KERNEL +void kprobe_thumb16_test_cases(void); +void kprobe_thumb32_test_cases(void); +#else +void kprobe_arm_test_cases(void); +#endif diff --git a/arch/arm/kernel/kprobes-thumb.c b/arch/arm/kernel/kprobes-thumb.c index 902ca59e8b11..8f96ec778e8d 100644 --- a/arch/arm/kernel/kprobes-thumb.c +++ b/arch/arm/kernel/kprobes-thumb.c @@ -10,6 +10,7 @@ #include #include +#include #include "kprobes.h" @@ -943,6 +944,9 @@ const union decode_item kprobe_decode_thumb32_table[] = { */ DECODE_END }; +#ifdef CONFIG_ARM_KPROBES_TEST_MODULE +EXPORT_SYMBOL_GPL(kprobe_decode_thumb32_table); +#endif static void __kprobes t16_simulate_bxblx(struct kprobe *p, struct pt_regs *regs) @@ -1423,6 +1427,9 @@ const union decode_item kprobe_decode_thumb16_table[] = { DECODE_END }; +#ifdef CONFIG_ARM_KPROBES_TEST_MODULE +EXPORT_SYMBOL_GPL(kprobe_decode_thumb16_table); +#endif static unsigned long __kprobes thumb_check_cc(unsigned long cpsr) { diff --git a/arch/arm/kernel/kprobes.h b/arch/arm/kernel/kprobes.h index a6aeda0a6c7f..38945f78f9f1 100644 --- a/arch/arm/kernel/kprobes.h +++ b/arch/arm/kernel/kprobes.h @@ -413,6 +413,14 @@ struct decode_reject { DECODE_HEADER(DECODE_TYPE_REJECT, _mask, _value, 0) +#ifdef CONFIG_THUMB2_KERNEL +extern const union decode_item kprobe_decode_thumb16_table[]; +extern const union decode_item kprobe_decode_thumb32_table[]; +#else +extern const union decode_item kprobe_decode_arm_table[]; +#endif + + int kprobe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi, const union decode_item *table, bool thumb16); diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c index 53c9c2610cbc..e6e5d7c84f1a 100644 --- a/arch/arm/kernel/perf_event.c +++ b/arch/arm/kernel/perf_event.c @@ -12,6 +12,7 @@ */ #define pr_fmt(fmt) "hw perfevents: " fmt +#include #include #include #include @@ -26,16 +27,8 @@ #include #include -static struct platform_device *pmu_device; - -/* - * Hardware lock to serialize accesses to PMU registers. Needed for the - * read/modify/write sequences. - */ -static DEFINE_RAW_SPINLOCK(pmu_lock); - /* - * ARMv6 supports a maximum of 3 events, starting from index 1. If we add + * ARMv6 supports a maximum of 3 events, starting from index 0. If we add * another platform that supports more, we need to increase this to be the * largest of all platforms. * @@ -43,62 +36,24 @@ static DEFINE_RAW_SPINLOCK(pmu_lock); * cycle counter CCNT + 31 events counters CNT0..30. * Cortex-A8 has 1+4 counters, Cortex-A9 has 1+6 counters. */ -#define ARMPMU_MAX_HWEVENTS 33 +#define ARMPMU_MAX_HWEVENTS 32 -/* The events for a given CPU. */ -struct cpu_hw_events { - /* - * The events that are active on the CPU for the given index. Index 0 - * is reserved. - */ - struct perf_event *events[ARMPMU_MAX_HWEVENTS]; - - /* - * A 1 bit for an index indicates that the counter is being used for - * an event. A 0 means that the counter can be used. - */ - unsigned long used_mask[BITS_TO_LONGS(ARMPMU_MAX_HWEVENTS)]; +static DEFINE_PER_CPU(struct perf_event * [ARMPMU_MAX_HWEVENTS], hw_events); +static DEFINE_PER_CPU(unsigned long [BITS_TO_LONGS(ARMPMU_MAX_HWEVENTS)], used_mask); +static DEFINE_PER_CPU(struct pmu_hw_events, cpu_hw_events); - /* - * A 1 bit for an index indicates that the counter is actively being - * used. 
- */ - unsigned long active_mask[BITS_TO_LONGS(ARMPMU_MAX_HWEVENTS)]; -}; -static DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events); - -struct arm_pmu { - enum arm_perf_pmu_ids id; - const char *name; - irqreturn_t (*handle_irq)(int irq_num, void *dev); - void (*enable)(struct hw_perf_event *evt, int idx); - void (*disable)(struct hw_perf_event *evt, int idx); - int (*get_event_idx)(struct cpu_hw_events *cpuc, - struct hw_perf_event *hwc); - u32 (*read_counter)(int idx); - void (*write_counter)(int idx, u32 val); - void (*start)(void); - void (*stop)(void); - void (*reset)(void *); - const unsigned (*cache_map)[PERF_COUNT_HW_CACHE_MAX] - [PERF_COUNT_HW_CACHE_OP_MAX] - [PERF_COUNT_HW_CACHE_RESULT_MAX]; - const unsigned (*event_map)[PERF_COUNT_HW_MAX]; - u32 raw_event_mask; - int num_events; - u64 max_period; -}; +#define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu)) /* Set at runtime when we know what CPU type we are. */ -static const struct arm_pmu *armpmu; +static struct arm_pmu *cpu_pmu; enum arm_perf_pmu_ids armpmu_get_pmu_id(void) { int id = -ENODEV; - if (armpmu != NULL) - id = armpmu->id; + if (cpu_pmu != NULL) + id = cpu_pmu->id; return id; } @@ -109,8 +64,8 @@ armpmu_get_max_events(void) { int max_events = 0; - if (armpmu != NULL) - max_events = armpmu->num_events; + if (cpu_pmu != NULL) + max_events = cpu_pmu->num_events; return max_events; } @@ -130,7 +85,11 @@ EXPORT_SYMBOL_GPL(perf_num_counters); #define CACHE_OP_UNSUPPORTED 0xFFFF static int -armpmu_map_cache_event(u64 config) +armpmu_map_cache_event(const unsigned (*cache_map) + [PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX], + u64 config) { unsigned int cache_type, cache_op, cache_result, ret; @@ -146,7 +105,7 @@ armpmu_map_cache_event(u64 config) if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX) return -EINVAL; - ret = (int)(*armpmu->cache_map)[cache_type][cache_op][cache_result]; + ret = (int)(*cache_map)[cache_type][cache_op][cache_result]; if (ret == CACHE_OP_UNSUPPORTED) return -ENOENT; @@ -155,23 +114,46 @@ armpmu_map_cache_event(u64 config) } static int -armpmu_map_event(u64 config) +armpmu_map_event(const unsigned (*event_map)[PERF_COUNT_HW_MAX], u64 config) { - int mapping = (*armpmu->event_map)[config]; - return mapping == HW_OP_UNSUPPORTED ? -EOPNOTSUPP : mapping; + int mapping = (*event_map)[config]; + return mapping == HW_OP_UNSUPPORTED ? 
-ENOENT : mapping; } static int -armpmu_map_raw_event(u64 config) +armpmu_map_raw_event(u32 raw_event_mask, u64 config) { - return (int)(config & armpmu->raw_event_mask); + return (int)(config & raw_event_mask); } -static int +static int map_cpu_event(struct perf_event *event, + const unsigned (*event_map)[PERF_COUNT_HW_MAX], + const unsigned (*cache_map) + [PERF_COUNT_HW_CACHE_MAX] + [PERF_COUNT_HW_CACHE_OP_MAX] + [PERF_COUNT_HW_CACHE_RESULT_MAX], + u32 raw_event_mask) +{ + u64 config = event->attr.config; + + switch (event->attr.type) { + case PERF_TYPE_HARDWARE: + return armpmu_map_event(event_map, config); + case PERF_TYPE_HW_CACHE: + return armpmu_map_cache_event(cache_map, config); + case PERF_TYPE_RAW: + return armpmu_map_raw_event(raw_event_mask, config); + } + + return -ENOENT; +} + +int armpmu_event_set_period(struct perf_event *event, struct hw_perf_event *hwc, int idx) { + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); s64 left = local64_read(&hwc->period_left); s64 period = hwc->sample_period; int ret = 0; @@ -202,11 +184,12 @@ armpmu_event_set_period(struct perf_event *event, return ret; } -static u64 +u64 armpmu_event_update(struct perf_event *event, struct hw_perf_event *hwc, int idx, int overflow) { + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); u64 delta, prev_raw_count, new_raw_count; again: @@ -246,11 +229,9 @@ armpmu_read(struct perf_event *event) static void armpmu_stop(struct perf_event *event, int flags) { + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); struct hw_perf_event *hwc = &event->hw; - if (!armpmu) - return; - /* * ARM pmu always has to update the counter, so ignore * PERF_EF_UPDATE, see comments in armpmu_start(). @@ -266,11 +247,9 @@ armpmu_stop(struct perf_event *event, int flags) static void armpmu_start(struct perf_event *event, int flags) { + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); struct hw_perf_event *hwc = &event->hw; - if (!armpmu) - return; - /* * ARM pmu always has to reprogram the period, so ignore * PERF_EF_RELOAD, see the comment below. @@ -293,16 +272,16 @@ armpmu_start(struct perf_event *event, int flags) static void armpmu_del(struct perf_event *event, int flags) { - struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); + struct pmu_hw_events *hw_events = armpmu->get_hw_events(); struct hw_perf_event *hwc = &event->hw; int idx = hwc->idx; WARN_ON(idx < 0); - clear_bit(idx, cpuc->active_mask); armpmu_stop(event, PERF_EF_UPDATE); - cpuc->events[idx] = NULL; - clear_bit(idx, cpuc->used_mask); + hw_events->events[idx] = NULL; + clear_bit(idx, hw_events->used_mask); perf_event_update_userpage(event); } @@ -310,7 +289,8 @@ armpmu_del(struct perf_event *event, int flags) static int armpmu_add(struct perf_event *event, int flags) { - struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); + struct pmu_hw_events *hw_events = armpmu->get_hw_events(); struct hw_perf_event *hwc = &event->hw; int idx; int err = 0; @@ -318,7 +298,7 @@ armpmu_add(struct perf_event *event, int flags) perf_pmu_disable(event->pmu); /* If we don't have a space for the counter then finish early. 
*/ - idx = armpmu->get_event_idx(cpuc, hwc); + idx = armpmu->get_event_idx(hw_events, hwc); if (idx < 0) { err = idx; goto out; @@ -330,8 +310,7 @@ armpmu_add(struct perf_event *event, int flags) */ event->hw.idx = idx; armpmu->disable(hwc, idx); - cpuc->events[idx] = event; - set_bit(idx, cpuc->active_mask); + hw_events->events[idx] = event; hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; if (flags & PERF_EF_START) @@ -345,25 +324,25 @@ out: return err; } -static struct pmu pmu; - static int -validate_event(struct cpu_hw_events *cpuc, +validate_event(struct pmu_hw_events *hw_events, struct perf_event *event) { + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); struct hw_perf_event fake_event = event->hw; + struct pmu *leader_pmu = event->group_leader->pmu; - if (event->pmu != &pmu || event->state <= PERF_EVENT_STATE_OFF) + if (event->pmu != leader_pmu || event->state <= PERF_EVENT_STATE_OFF) return 1; - return armpmu->get_event_idx(cpuc, &fake_event) >= 0; + return armpmu->get_event_idx(hw_events, &fake_event) >= 0; } static int validate_group(struct perf_event *event) { struct perf_event *sibling, *leader = event->group_leader; - struct cpu_hw_events fake_pmu; + struct pmu_hw_events fake_pmu; memset(&fake_pmu, 0, sizeof(fake_pmu)); @@ -383,110 +362,119 @@ validate_group(struct perf_event *event) static irqreturn_t armpmu_platform_irq(int irq, void *dev) { - struct arm_pmu_platdata *plat = dev_get_platdata(&pmu_device->dev); + struct arm_pmu *armpmu = (struct arm_pmu *) dev; + struct platform_device *plat_device = armpmu->plat_device; + struct arm_pmu_platdata *plat = dev_get_platdata(&plat_device->dev); return plat->handle_irq(irq, dev, armpmu->handle_irq); } +static void +armpmu_release_hardware(struct arm_pmu *armpmu) +{ + int i, irq, irqs; + struct platform_device *pmu_device = armpmu->plat_device; + + irqs = min(pmu_device->num_resources, num_possible_cpus()); + + for (i = 0; i < irqs; ++i) { + if (!cpumask_test_and_clear_cpu(i, &armpmu->active_irqs)) + continue; + irq = platform_get_irq(pmu_device, i); + if (irq >= 0) + free_irq(irq, armpmu); + } + + release_pmu(armpmu->type); +} + static int -armpmu_reserve_hardware(void) +armpmu_reserve_hardware(struct arm_pmu *armpmu) { struct arm_pmu_platdata *plat; irq_handler_t handle_irq; - int i, err = -ENODEV, irq; + int i, err, irq, irqs; + struct platform_device *pmu_device = armpmu->plat_device; - pmu_device = reserve_pmu(ARM_PMU_DEVICE_CPU); - if (IS_ERR(pmu_device)) { + err = reserve_pmu(armpmu->type); + if (err) { pr_warning("unable to reserve pmu\n"); - return PTR_ERR(pmu_device); + return err; } - init_pmu(ARM_PMU_DEVICE_CPU); - plat = dev_get_platdata(&pmu_device->dev); if (plat && plat->handle_irq) handle_irq = armpmu_platform_irq; else handle_irq = armpmu->handle_irq; - if (pmu_device->num_resources < 1) { + irqs = min(pmu_device->num_resources, num_possible_cpus()); + if (irqs < 1) { pr_err("no irqs for PMUs defined\n"); return -ENODEV; } - for (i = 0; i < pmu_device->num_resources; ++i) { + for (i = 0; i < irqs; ++i) { + err = 0; irq = platform_get_irq(pmu_device, i); if (irq < 0) continue; + /* + * If we have a single PMU interrupt that we can't shift, + * assume that we're running on a uniprocessor machine and + * continue. Otherwise, continue without this interrupt. 
+	 */
+	if (irq_set_affinity(irq, cpumask_of(i)) && irqs > 1) {
+		pr_warning("unable to set irq affinity (irq=%d, cpu=%u)\n",
+			irq, i);
+		continue;
+	}
+
	err = request_irq(irq, handle_irq, IRQF_DISABLED | IRQF_NOBALANCING,
-		 "armpmu", NULL);
+		 "arm-pmu", armpmu);
	if (err) {
-		pr_warning("unable to request IRQ%d for ARM perf "
-			"counters\n", irq);
-		break;
+		pr_err("unable to request IRQ%d for ARM PMU counters\n",
+			irq);
+		armpmu_release_hardware(armpmu);
+		return err;
	}
-	}
-	if (err) {
-		for (i = i - 1; i >= 0; --i) {
-			irq = platform_get_irq(pmu_device, i);
-			if (irq >= 0)
-				free_irq(irq, NULL);
-		}
-		release_pmu(ARM_PMU_DEVICE_CPU);
-		pmu_device = NULL;
+		cpumask_set_cpu(i, &armpmu->active_irqs);
	}
-	return err;
+	return 0;
}
static void
-armpmu_release_hardware(void)
+hw_perf_event_destroy(struct perf_event *event)
{
-	int i, irq;
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	atomic_t *active_events = &armpmu->active_events;
+	struct mutex *pmu_reserve_mutex = &armpmu->reserve_mutex;
-	for (i = pmu_device->num_resources - 1; i >= 0; --i) {
-		irq = platform_get_irq(pmu_device, i);
-		if (irq >= 0)
-			free_irq(irq, NULL);
+	if (atomic_dec_and_mutex_lock(active_events, pmu_reserve_mutex)) {
+		armpmu_release_hardware(armpmu);
+		mutex_unlock(pmu_reserve_mutex);
	}
-	armpmu->stop();
-
-	release_pmu(ARM_PMU_DEVICE_CPU);
-	pmu_device = NULL;
}
-static atomic_t active_events = ATOMIC_INIT(0);
-static DEFINE_MUTEX(pmu_reserve_mutex);
-
-static void
-hw_perf_event_destroy(struct perf_event *event)
+static int
+event_requires_mode_exclusion(struct perf_event_attr *attr)
{
-	if (atomic_dec_and_mutex_lock(&active_events, &pmu_reserve_mutex)) {
-		armpmu_release_hardware();
-		mutex_unlock(&pmu_reserve_mutex);
-	}
+	return attr->exclude_idle || attr->exclude_user ||
+		attr->exclude_kernel || attr->exclude_hv;
}
static int
__hw_perf_event_init(struct perf_event *event)
{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
	struct hw_perf_event *hwc = &event->hw;
	int mapping, err;
-	/* Decode the generic type into an ARM event identifier. */
-	if (PERF_TYPE_HARDWARE == event->attr.type) {
-		mapping = armpmu_map_event(event->attr.config);
-	} else if (PERF_TYPE_HW_CACHE == event->attr.type) {
-		mapping = armpmu_map_cache_event(event->attr.config);
-	} else if (PERF_TYPE_RAW == event->attr.type) {
-		mapping = armpmu_map_raw_event(event->attr.config);
-	} else {
-		pr_debug("event type %x not supported\n", event->attr.type);
-		return -EOPNOTSUPP;
-	}
+	mapping = armpmu->map_event(event);
	if (mapping < 0) {
		pr_debug("event %x:%llx not supported\n", event->attr.type,
@@ -494,35 +482,32 @@ __hw_perf_event_init(struct perf_event *event)
		return mapping;
	}
+	/*
+	 * We don't assign an index until we actually place the event onto
+	 * hardware. Use -1 to signify that we haven't decided where to put it
+	 * yet. For SMP systems, each core has its own PMU so we can't do any
+	 * clever allocation or constraints checking at this point.
+	 */
+	hwc->idx = -1;
+	hwc->config_base = 0;
+	hwc->config = 0;
+	hwc->event_base = 0;
+
	/*
	 * Check whether we need to exclude the counter from certain modes.
-	 * The ARM performance counters are on all of the time so if someone
-	 * has asked us for some excludes then we have to fail.
*/ - if (event->attr.exclude_kernel || event->attr.exclude_user || - event->attr.exclude_hv || event->attr.exclude_idle) { + if ((!armpmu->set_event_filter || + armpmu->set_event_filter(hwc, &event->attr)) && + event_requires_mode_exclusion(&event->attr)) { pr_debug("ARM performance counters do not support " "mode exclusion\n"); return -EPERM; } /* - * We don't assign an index until we actually place the event onto - * hardware. Use -1 to signify that we haven't decided where to put it - * yet. For SMP systems, each core has it's own PMU so we can't do any - * clever allocation or constraints checking at this point. + * Store the event encoding into the config_base field. */ - hwc->idx = -1; - - /* - * Store the event encoding into the config_base field. config and - * event_base are unused as the only 2 things we need to know are - * the event mapping and the counter to use. The counter to use is - * also the indx and the config_base is the event type. - */ - hwc->config_base = (unsigned long)mapping; - hwc->config = 0; - hwc->event_base = 0; + hwc->config_base |= (unsigned long)mapping; if (!hwc->sample_period) { hwc->sample_period = armpmu->max_period; @@ -542,32 +527,23 @@ __hw_perf_event_init(struct perf_event *event) static int armpmu_event_init(struct perf_event *event) { + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); int err = 0; + atomic_t *active_events = &armpmu->active_events; - switch (event->attr.type) { - case PERF_TYPE_RAW: - case PERF_TYPE_HARDWARE: - case PERF_TYPE_HW_CACHE: - break; - - default: + if (armpmu->map_event(event) == -ENOENT) return -ENOENT; - } - - if (!armpmu) - return -ENODEV; event->destroy = hw_perf_event_destroy; - if (!atomic_inc_not_zero(&active_events)) { - mutex_lock(&pmu_reserve_mutex); - if (atomic_read(&active_events) == 0) { - err = armpmu_reserve_hardware(); - } + if (!atomic_inc_not_zero(active_events)) { + mutex_lock(&armpmu->reserve_mutex); + if (atomic_read(active_events) == 0) + err = armpmu_reserve_hardware(armpmu); if (!err) - atomic_inc(&active_events); - mutex_unlock(&pmu_reserve_mutex); + atomic_inc(active_events); + mutex_unlock(&armpmu->reserve_mutex); } if (err) @@ -582,22 +558,9 @@ static int armpmu_event_init(struct perf_event *event) static void armpmu_enable(struct pmu *pmu) { - /* Enable all of the perf events on hardware. 
*/ - int idx, enabled = 0; - struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events); - - if (!armpmu) - return; - - for (idx = 0; idx <= armpmu->num_events; ++idx) { - struct perf_event *event = cpuc->events[idx]; - - if (!event) - continue; - - armpmu->enable(&event->hw, idx); - enabled = 1; - } + struct arm_pmu *armpmu = to_arm_pmu(pmu); + struct pmu_hw_events *hw_events = armpmu->get_hw_events(); + int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events); if (enabled) armpmu->start(); @@ -605,20 +568,32 @@ static void armpmu_enable(struct pmu *pmu) static void armpmu_disable(struct pmu *pmu) { - if (armpmu) - armpmu->stop(); + struct arm_pmu *armpmu = to_arm_pmu(pmu); + armpmu->stop(); } -static struct pmu pmu = { - .pmu_enable = armpmu_enable, - .pmu_disable = armpmu_disable, - .event_init = armpmu_event_init, - .add = armpmu_add, - .del = armpmu_del, - .start = armpmu_start, - .stop = armpmu_stop, - .read = armpmu_read, -}; +static void __init armpmu_init(struct arm_pmu *armpmu) +{ + atomic_set(&armpmu->active_events, 0); + mutex_init(&armpmu->reserve_mutex); + + armpmu->pmu = (struct pmu) { + .pmu_enable = armpmu_enable, + .pmu_disable = armpmu_disable, + .event_init = armpmu_event_init, + .add = armpmu_add, + .del = armpmu_del, + .start = armpmu_start, + .stop = armpmu_stop, + .read = armpmu_read, + }; +} + +int __init armpmu_register(struct arm_pmu *armpmu, char *name, int type) +{ + armpmu_init(armpmu); + return perf_pmu_register(&armpmu->pmu, name, type); +} /* Include the PMU-specific implementations. */ #include "perf_event_xscale.c" @@ -630,14 +605,72 @@ static struct pmu pmu = { * This requires SMP to be available, so exists as a separate initcall. */ static int __init -armpmu_reset(void) +cpu_pmu_reset(void) +{ + if (cpu_pmu && cpu_pmu->reset) + return on_each_cpu(cpu_pmu->reset, NULL, 1); + return 0; +} +arch_initcall(cpu_pmu_reset); + +/* + * PMU platform driver and devicetree bindings. + */ +static struct of_device_id armpmu_of_device_ids[] = { + {.compatible = "arm,cortex-a9-pmu"}, + {.compatible = "arm,cortex-a8-pmu"}, + {.compatible = "arm,arm1136-pmu"}, + {.compatible = "arm,arm1176-pmu"}, + {}, +}; + +static struct platform_device_id armpmu_plat_device_ids[] = { + {.name = "arm-pmu"}, + {}, +}; + +static int __devinit armpmu_device_probe(struct platform_device *pdev) { - if (armpmu && armpmu->reset) - return on_each_cpu(armpmu->reset, NULL, 1); + cpu_pmu->plat_device = pdev; return 0; } -arch_initcall(armpmu_reset); +static struct platform_driver armpmu_driver = { + .driver = { + .name = "arm-pmu", + .of_match_table = armpmu_of_device_ids, + }, + .probe = armpmu_device_probe, + .id_table = armpmu_plat_device_ids, +}; + +static int __init register_pmu_driver(void) +{ + return platform_driver_register(&armpmu_driver); +} +device_initcall(register_pmu_driver); + +static struct pmu_hw_events *armpmu_get_cpu_events(void) +{ + return &__get_cpu_var(cpu_hw_events); +} + +static void __init cpu_pmu_init(struct arm_pmu *armpmu) +{ + int cpu; + for_each_possible_cpu(cpu) { + struct pmu_hw_events *events = &per_cpu(cpu_hw_events, cpu); + events->events = per_cpu(hw_events, cpu); + events->used_mask = per_cpu(used_mask, cpu); + raw_spin_lock_init(&events->pmu_lock); + } + armpmu->get_hw_events = armpmu_get_cpu_events; + armpmu->type = ARM_PMU_DEVICE_CPU; +} + +/* + * CPU PMU identification and registration. 
+ */ static int __init init_hw_perf_events(void) { @@ -651,22 +684,22 @@ init_hw_perf_events(void) case 0xB360: /* ARM1136 */ case 0xB560: /* ARM1156 */ case 0xB760: /* ARM1176 */ - armpmu = armv6pmu_init(); + cpu_pmu = armv6pmu_init(); break; case 0xB020: /* ARM11mpcore */ - armpmu = armv6mpcore_pmu_init(); + cpu_pmu = armv6mpcore_pmu_init(); break; case 0xC080: /* Cortex-A8 */ - armpmu = armv7_a8_pmu_init(); + cpu_pmu = armv7_a8_pmu_init(); break; case 0xC090: /* Cortex-A9 */ - armpmu = armv7_a9_pmu_init(); + cpu_pmu = armv7_a9_pmu_init(); break; case 0xC050: /* Cortex-A5 */ - armpmu = armv7_a5_pmu_init(); + cpu_pmu = armv7_a5_pmu_init(); break; case 0xC0F0: /* Cortex-A15 */ - armpmu = armv7_a15_pmu_init(); + cpu_pmu = armv7_a15_pmu_init(); break; } /* Intel CPUs [xscale]. */ @@ -674,23 +707,23 @@ init_hw_perf_events(void) part_number = (cpuid >> 13) & 0x7; switch (part_number) { case 1: - armpmu = xscale1pmu_init(); + cpu_pmu = xscale1pmu_init(); break; case 2: - armpmu = xscale2pmu_init(); + cpu_pmu = xscale2pmu_init(); break; } } - if (armpmu) { + if (cpu_pmu) { pr_info("enabled with %s PMU driver, %d counters available\n", - armpmu->name, armpmu->num_events); + cpu_pmu->name, cpu_pmu->num_events); + cpu_pmu_init(cpu_pmu); + armpmu_register(cpu_pmu, "cpu", PERF_TYPE_RAW); } else { pr_info("no hardware support available\n"); } - perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW); - return 0; } early_initcall(init_hw_perf_events); diff --git a/arch/arm/kernel/perf_event_v6.c b/arch/arm/kernel/perf_event_v6.c index dd7f3b9f4cb3..e63d8115c01b 100644 --- a/arch/arm/kernel/perf_event_v6.c +++ b/arch/arm/kernel/perf_event_v6.c @@ -54,7 +54,7 @@ enum armv6_perf_types { }; enum armv6_counters { - ARMV6_CYCLE_COUNTER = 1, + ARMV6_CYCLE_COUNTER = 0, ARMV6_COUNTER0, ARMV6_COUNTER1, }; @@ -433,6 +433,7 @@ armv6pmu_enable_event(struct hw_perf_event *hwc, int idx) { unsigned long val, mask, evt, flags; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); if (ARMV6_CYCLE_COUNTER == idx) { mask = 0; @@ -454,12 +455,29 @@ armv6pmu_enable_event(struct hw_perf_event *hwc, * Mask out the current event and set the counter to count the event * that we're interested in. 
*/ - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = armv6_pmcr_read(); val &= ~mask; val |= evt; armv6_pmcr_write(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); +} + +static int counter_is_active(unsigned long pmcr, int idx) +{ + unsigned long mask = 0; + if (idx == ARMV6_CYCLE_COUNTER) + mask = ARMV6_PMCR_CCOUNT_IEN; + else if (idx == ARMV6_COUNTER0) + mask = ARMV6_PMCR_COUNT0_IEN; + else if (idx == ARMV6_COUNTER1) + mask = ARMV6_PMCR_COUNT1_IEN; + + if (mask) + return pmcr & mask; + + WARN_ONCE(1, "invalid counter number (%d)\n", idx); + return 0; } static irqreturn_t @@ -468,7 +486,7 @@ armv6pmu_handle_irq(int irq_num, { unsigned long pmcr = armv6_pmcr_read(); struct perf_sample_data data; - struct cpu_hw_events *cpuc; + struct pmu_hw_events *cpuc; struct pt_regs *regs; int idx; @@ -487,11 +505,11 @@ armv6pmu_handle_irq(int irq_num, perf_sample_data_init(&data, 0); cpuc = &__get_cpu_var(cpu_hw_events); - for (idx = 0; idx <= armpmu->num_events; ++idx) { + for (idx = 0; idx < cpu_pmu->num_events; ++idx) { struct perf_event *event = cpuc->events[idx]; struct hw_perf_event *hwc; - if (!test_bit(idx, cpuc->active_mask)) + if (!counter_is_active(pmcr, idx)) continue; /* @@ -508,7 +526,7 @@ armv6pmu_handle_irq(int irq_num, continue; if (perf_event_overflow(event, &data, regs)) - armpmu->disable(hwc, idx); + cpu_pmu->disable(hwc, idx); } /* @@ -527,28 +545,30 @@ static void armv6pmu_start(void) { unsigned long flags, val; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = armv6_pmcr_read(); val |= ARMV6_PMCR_ENABLE; armv6_pmcr_write(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void armv6pmu_stop(void) { unsigned long flags, val; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = armv6_pmcr_read(); val &= ~ARMV6_PMCR_ENABLE; armv6_pmcr_write(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static int -armv6pmu_get_event_idx(struct cpu_hw_events *cpuc, +armv6pmu_get_event_idx(struct pmu_hw_events *cpuc, struct hw_perf_event *event) { /* Always place a cycle counter into the cycle counter. */ @@ -578,6 +598,7 @@ armv6pmu_disable_event(struct hw_perf_event *hwc, int idx) { unsigned long val, mask, evt, flags; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); if (ARMV6_CYCLE_COUNTER == idx) { mask = ARMV6_PMCR_CCOUNT_IEN; @@ -598,12 +619,12 @@ armv6pmu_disable_event(struct hw_perf_event *hwc, * of ETM bus signal assertion cycles. The external reporting should * be disabled and so this should never increment. 
*/ - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = armv6_pmcr_read(); val &= ~mask; val |= evt; armv6_pmcr_write(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void @@ -611,6 +632,7 @@ armv6mpcore_pmu_disable_event(struct hw_perf_event *hwc, int idx) { unsigned long val, mask, flags, evt = 0; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); if (ARMV6_CYCLE_COUNTER == idx) { mask = ARMV6_PMCR_CCOUNT_IEN; @@ -627,15 +649,21 @@ armv6mpcore_pmu_disable_event(struct hw_perf_event *hwc, * Unlike UP ARMv6, we don't have a way of stopping the counters. We * simply disable the interrupt reporting. */ - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = armv6_pmcr_read(); val &= ~mask; val |= evt; armv6_pmcr_write(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); +} + +static int armv6_map_event(struct perf_event *event) +{ + return map_cpu_event(event, &armv6_perf_map, + &armv6_perf_cache_map, 0xFF); } -static const struct arm_pmu armv6pmu = { +static struct arm_pmu armv6pmu = { .id = ARM_PERF_PMU_ID_V6, .name = "v6", .handle_irq = armv6pmu_handle_irq, @@ -646,14 +674,12 @@ static const struct arm_pmu armv6pmu = { .get_event_idx = armv6pmu_get_event_idx, .start = armv6pmu_start, .stop = armv6pmu_stop, - .cache_map = &armv6_perf_cache_map, - .event_map = &armv6_perf_map, - .raw_event_mask = 0xFF, + .map_event = armv6_map_event, .num_events = 3, .max_period = (1LLU << 32) - 1, }; -static const struct arm_pmu *__init armv6pmu_init(void) +static struct arm_pmu *__init armv6pmu_init(void) { return &armv6pmu; } @@ -665,7 +691,14 @@ static const struct arm_pmu *__init armv6pmu_init(void) * disable the interrupt reporting and update the event. When unthrottling we * reset the period and enable the interrupt reporting. 
*/ -static const struct arm_pmu armv6mpcore_pmu = { + +static int armv6mpcore_map_event(struct perf_event *event) +{ + return map_cpu_event(event, &armv6mpcore_perf_map, + &armv6mpcore_perf_cache_map, 0xFF); +} + +static struct arm_pmu armv6mpcore_pmu = { .id = ARM_PERF_PMU_ID_V6MP, .name = "v6mpcore", .handle_irq = armv6pmu_handle_irq, @@ -676,24 +709,22 @@ static const struct arm_pmu armv6mpcore_pmu = { .get_event_idx = armv6pmu_get_event_idx, .start = armv6pmu_start, .stop = armv6pmu_stop, - .cache_map = &armv6mpcore_perf_cache_map, - .event_map = &armv6mpcore_perf_map, - .raw_event_mask = 0xFF, + .map_event = armv6mpcore_map_event, .num_events = 3, .max_period = (1LLU << 32) - 1, }; -static const struct arm_pmu *__init armv6mpcore_pmu_init(void) +static struct arm_pmu *__init armv6mpcore_pmu_init(void) { return &armv6mpcore_pmu; } #else -static const struct arm_pmu *__init armv6pmu_init(void) +static struct arm_pmu *__init armv6pmu_init(void) { return NULL; } -static const struct arm_pmu *__init armv6mpcore_pmu_init(void) +static struct arm_pmu *__init armv6mpcore_pmu_init(void) { return NULL; } diff --git a/arch/arm/kernel/perf_event_v7.c b/arch/arm/kernel/perf_event_v7.c index 4c851834f68e..98b75738345e 100644 --- a/arch/arm/kernel/perf_event_v7.c +++ b/arch/arm/kernel/perf_event_v7.c @@ -17,6 +17,9 @@ */ #ifdef CONFIG_CPU_V7 + +static struct arm_pmu armv7pmu; + /* * Common ARMv7 event types * @@ -676,23 +679,24 @@ static const unsigned armv7_a15_perf_cache_map[PERF_COUNT_HW_CACHE_MAX] }; /* - * Perf Events counters + * Perf Events' indices */ -enum armv7_counters { - ARMV7_CYCLE_COUNTER = 1, /* Cycle counter */ - ARMV7_COUNTER0 = 2, /* First event counter */ -}; +#define ARMV7_IDX_CYCLE_COUNTER 0 +#define ARMV7_IDX_COUNTER0 1 +#define ARMV7_IDX_COUNTER_LAST (ARMV7_IDX_CYCLE_COUNTER + cpu_pmu->num_events - 1) + +#define ARMV7_MAX_COUNTERS 32 +#define ARMV7_COUNTER_MASK (ARMV7_MAX_COUNTERS - 1) /* - * The cycle counter is ARMV7_CYCLE_COUNTER. - * The first event counter is ARMV7_COUNTER0. - * The last event counter is (ARMV7_COUNTER0 + armpmu->num_events - 1). 
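Under the zero-based scheme the cycle counter becomes software index 0 and the event counters start at index 1; the ARMV7_IDX_TO_COUNTER() macro defined just below recovers the hardware counter number used as a bit position in the SET/CLR registers. Note that the cycle counter wraps around to bit 31 ((0 - 1) & 0x1f), matching its fixed hardware slot. For example:

	/* idx 1 (first event counter) -> hardware counter 0  -> BIT(0)  */
	/* idx 0 (cycle counter)       -> hardware counter 31 -> BIT(31) */
	u32 set = BIT(ARMV7_IDX_TO_COUNTER(idx));
	asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r" (set));	/* PMCNTENSET */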
+ * ARMv7 low level PMNC access */ -#define ARMV7_COUNTER_LAST (ARMV7_COUNTER0 + armpmu->num_events - 1) /* - * ARMv7 low level PMNC access + * Perf Event to low level counters mapping */ +#define ARMV7_IDX_TO_COUNTER(x) \ + (((x) - ARMV7_IDX_COUNTER0) & ARMV7_COUNTER_MASK) /* * Per-CPU PMNC: config reg @@ -708,103 +712,76 @@ enum armv7_counters { #define ARMV7_PMNC_MASK 0x3f /* Mask for writable bits */ /* - * Available counters - */ -#define ARMV7_CNT0 0 /* First event counter */ -#define ARMV7_CCNT 31 /* Cycle counter */ - -/* Perf Event to low level counters mapping */ -#define ARMV7_EVENT_CNT_TO_CNTx (ARMV7_COUNTER0 - ARMV7_CNT0) - -/* - * CNTENS: counters enable reg - */ -#define ARMV7_CNTENS_P(idx) (1 << (idx - ARMV7_EVENT_CNT_TO_CNTx)) -#define ARMV7_CNTENS_C (1 << ARMV7_CCNT) - -/* - * CNTENC: counters disable reg - */ -#define ARMV7_CNTENC_P(idx) (1 << (idx - ARMV7_EVENT_CNT_TO_CNTx)) -#define ARMV7_CNTENC_C (1 << ARMV7_CCNT) - -/* - * INTENS: counters overflow interrupt enable reg - */ -#define ARMV7_INTENS_P(idx) (1 << (idx - ARMV7_EVENT_CNT_TO_CNTx)) -#define ARMV7_INTENS_C (1 << ARMV7_CCNT) - -/* - * INTENC: counters overflow interrupt disable reg - */ -#define ARMV7_INTENC_P(idx) (1 << (idx - ARMV7_EVENT_CNT_TO_CNTx)) -#define ARMV7_INTENC_C (1 << ARMV7_CCNT) - -/* - * EVTSEL: Event selection reg + * FLAG: counters overflow flag status reg */ -#define ARMV7_EVTSEL_MASK 0xff /* Mask for writable bits */ +#define ARMV7_FLAG_MASK 0xffffffff /* Mask for writable bits */ +#define ARMV7_OVERFLOWED_MASK ARMV7_FLAG_MASK /* - * SELECT: Counter selection reg + * PMXEVTYPER: Event selection reg */ -#define ARMV7_SELECT_MASK 0x1f /* Mask for writable bits */ +#define ARMV7_EVTYPE_MASK 0xc00000ff /* Mask for writable bits */ +#define ARMV7_EVTYPE_EVENT 0xff /* Mask for EVENT bits */ /* - * FLAG: counters overflow flag status reg + * Event filters for PMUv2 */ -#define ARMV7_FLAG_P(idx) (1 << (idx - ARMV7_EVENT_CNT_TO_CNTx)) -#define ARMV7_FLAG_C (1 << ARMV7_CCNT) -#define ARMV7_FLAG_MASK 0xffffffff /* Mask for writable bits */ -#define ARMV7_OVERFLOWED_MASK ARMV7_FLAG_MASK +#define ARMV7_EXCLUDE_PL1 (1 << 31) +#define ARMV7_EXCLUDE_USER (1 << 30) +#define ARMV7_INCLUDE_HYP (1 << 27) -static inline unsigned long armv7_pmnc_read(void) +static inline u32 armv7_pmnc_read(void) { u32 val; asm volatile("mrc p15, 0, %0, c9, c12, 0" : "=r"(val)); return val; } -static inline void armv7_pmnc_write(unsigned long val) +static inline void armv7_pmnc_write(u32 val) { val &= ARMV7_PMNC_MASK; isb(); asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(val)); } -static inline int armv7_pmnc_has_overflowed(unsigned long pmnc) +static inline int armv7_pmnc_has_overflowed(u32 pmnc) { return pmnc & ARMV7_OVERFLOWED_MASK; } -static inline int armv7_pmnc_counter_has_overflowed(unsigned long pmnc, - enum armv7_counters counter) +static inline int armv7_pmnc_counter_valid(int idx) +{ + return idx >= ARMV7_IDX_CYCLE_COUNTER && idx <= ARMV7_IDX_COUNTER_LAST; +} + +static inline int armv7_pmnc_counter_has_overflowed(u32 pmnc, int idx) { int ret = 0; + u32 counter; - if (counter == ARMV7_CYCLE_COUNTER) - ret = pmnc & ARMV7_FLAG_C; - else if ((counter >= ARMV7_COUNTER0) && (counter <= ARMV7_COUNTER_LAST)) - ret = pmnc & ARMV7_FLAG_P(counter); - else + if (!armv7_pmnc_counter_valid(idx)) { pr_err("CPU%u checking wrong counter %d overflow status\n", - smp_processor_id(), counter); + smp_processor_id(), idx); + } else { + counter = ARMV7_IDX_TO_COUNTER(idx); + ret = pmnc & BIT(counter); + } return ret; } -static inline 
int armv7_pmnc_select_counter(unsigned int idx) +static inline int armv7_pmnc_select_counter(int idx) { - u32 val; + u32 counter; - if ((idx < ARMV7_COUNTER0) || (idx > ARMV7_COUNTER_LAST)) { - pr_err("CPU%u selecting wrong PMNC counter" - " %d\n", smp_processor_id(), idx); - return -1; + if (!armv7_pmnc_counter_valid(idx)) { + pr_err("CPU%u selecting wrong PMNC counter %d\n", + smp_processor_id(), idx); + return -EINVAL; } - val = (idx - ARMV7_EVENT_CNT_TO_CNTx) & ARMV7_SELECT_MASK; - asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (val)); + counter = ARMV7_IDX_TO_COUNTER(idx); + asm volatile("mcr p15, 0, %0, c9, c12, 5" : : "r" (counter)); isb(); return idx; @@ -812,124 +789,95 @@ static inline int armv7_pmnc_select_counter(unsigned int idx) static inline u32 armv7pmu_read_counter(int idx) { - unsigned long value = 0; + u32 value = 0; - if (idx == ARMV7_CYCLE_COUNTER) - asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (value)); - else if ((idx >= ARMV7_COUNTER0) && (idx <= ARMV7_COUNTER_LAST)) { - if (armv7_pmnc_select_counter(idx) == idx) - asm volatile("mrc p15, 0, %0, c9, c13, 2" - : "=r" (value)); - } else + if (!armv7_pmnc_counter_valid(idx)) pr_err("CPU%u reading wrong counter %d\n", smp_processor_id(), idx); + else if (idx == ARMV7_IDX_CYCLE_COUNTER) + asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (value)); + else if (armv7_pmnc_select_counter(idx) == idx) + asm volatile("mrc p15, 0, %0, c9, c13, 2" : "=r" (value)); return value; } static inline void armv7pmu_write_counter(int idx, u32 value) { - if (idx == ARMV7_CYCLE_COUNTER) - asm volatile("mcr p15, 0, %0, c9, c13, 0" : : "r" (value)); - else if ((idx >= ARMV7_COUNTER0) && (idx <= ARMV7_COUNTER_LAST)) { - if (armv7_pmnc_select_counter(idx) == idx) - asm volatile("mcr p15, 0, %0, c9, c13, 2" - : : "r" (value)); - } else + if (!armv7_pmnc_counter_valid(idx)) pr_err("CPU%u writing wrong counter %d\n", smp_processor_id(), idx); + else if (idx == ARMV7_IDX_CYCLE_COUNTER) + asm volatile("mcr p15, 0, %0, c9, c13, 0" : : "r" (value)); + else if (armv7_pmnc_select_counter(idx) == idx) + asm volatile("mcr p15, 0, %0, c9, c13, 2" : : "r" (value)); } -static inline void armv7_pmnc_write_evtsel(unsigned int idx, u32 val) +static inline void armv7_pmnc_write_evtsel(int idx, u32 val) { if (armv7_pmnc_select_counter(idx) == idx) { - val &= ARMV7_EVTSEL_MASK; + val &= ARMV7_EVTYPE_MASK; asm volatile("mcr p15, 0, %0, c9, c13, 1" : : "r" (val)); } } -static inline u32 armv7_pmnc_enable_counter(unsigned int idx) +static inline int armv7_pmnc_enable_counter(int idx) { - u32 val; + u32 counter; - if ((idx != ARMV7_CYCLE_COUNTER) && - ((idx < ARMV7_COUNTER0) || (idx > ARMV7_COUNTER_LAST))) { - pr_err("CPU%u enabling wrong PMNC counter" - " %d\n", smp_processor_id(), idx); - return -1; + if (!armv7_pmnc_counter_valid(idx)) { + pr_err("CPU%u enabling wrong PMNC counter %d\n", + smp_processor_id(), idx); + return -EINVAL; } - if (idx == ARMV7_CYCLE_COUNTER) - val = ARMV7_CNTENS_C; - else - val = ARMV7_CNTENS_P(idx); - - asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r" (val)); - + counter = ARMV7_IDX_TO_COUNTER(idx); + asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r" (BIT(counter))); return idx; } -static inline u32 armv7_pmnc_disable_counter(unsigned int idx) +static inline int armv7_pmnc_disable_counter(int idx) { - u32 val; - + u32 counter; - if ((idx != ARMV7_CYCLE_COUNTER) && - ((idx < ARMV7_COUNTER0) || (idx > ARMV7_COUNTER_LAST))) { - pr_err("CPU%u disabling wrong PMNC counter" - " %d\n", smp_processor_id(), idx); - return -1; + if 
(!armv7_pmnc_counter_valid(idx)) { + pr_err("CPU%u disabling wrong PMNC counter %d\n", + smp_processor_id(), idx); + return -EINVAL; } - if (idx == ARMV7_CYCLE_COUNTER) - val = ARMV7_CNTENC_C; - else - val = ARMV7_CNTENC_P(idx); - - asm volatile("mcr p15, 0, %0, c9, c12, 2" : : "r" (val)); - + counter = ARMV7_IDX_TO_COUNTER(idx); + asm volatile("mcr p15, 0, %0, c9, c12, 2" : : "r" (BIT(counter))); return idx; } -static inline u32 armv7_pmnc_enable_intens(unsigned int idx) +static inline int armv7_pmnc_enable_intens(int idx) { - u32 val; + u32 counter; - if ((idx != ARMV7_CYCLE_COUNTER) && - ((idx < ARMV7_COUNTER0) || (idx > ARMV7_COUNTER_LAST))) { - pr_err("CPU%u enabling wrong PMNC counter" - " interrupt enable %d\n", smp_processor_id(), idx); - return -1; + if (!armv7_pmnc_counter_valid(idx)) { + pr_err("CPU%u enabling wrong PMNC counter IRQ enable %d\n", + smp_processor_id(), idx); + return -EINVAL; } - if (idx == ARMV7_CYCLE_COUNTER) - val = ARMV7_INTENS_C; - else - val = ARMV7_INTENS_P(idx); - - asm volatile("mcr p15, 0, %0, c9, c14, 1" : : "r" (val)); - + counter = ARMV7_IDX_TO_COUNTER(idx); + asm volatile("mcr p15, 0, %0, c9, c14, 1" : : "r" (BIT(counter))); return idx; } -static inline u32 armv7_pmnc_disable_intens(unsigned int idx) +static inline int armv7_pmnc_disable_intens(int idx) { - u32 val; + u32 counter; - if ((idx != ARMV7_CYCLE_COUNTER) && - ((idx < ARMV7_COUNTER0) || (idx > ARMV7_COUNTER_LAST))) { - pr_err("CPU%u disabling wrong PMNC counter" - " interrupt enable %d\n", smp_processor_id(), idx); - return -1; + if (!armv7_pmnc_counter_valid(idx)) { + pr_err("CPU%u disabling wrong PMNC counter IRQ enable %d\n", + smp_processor_id(), idx); + return -EINVAL; } - if (idx == ARMV7_CYCLE_COUNTER) - val = ARMV7_INTENC_C; - else - val = ARMV7_INTENC_P(idx); - - asm volatile("mcr p15, 0, %0, c9, c14, 2" : : "r" (val)); - + counter = ARMV7_IDX_TO_COUNTER(idx); + asm volatile("mcr p15, 0, %0, c9, c14, 2" : : "r" (BIT(counter))); return idx; } @@ -973,14 +921,14 @@ static void armv7_pmnc_dump_regs(void) asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (val)); printk(KERN_INFO "CCNT =0x%08x\n", val); - for (cnt = ARMV7_COUNTER0; cnt < ARMV7_COUNTER_LAST; cnt++) { + for (cnt = ARMV7_IDX_COUNTER0; cnt <= ARMV7_IDX_COUNTER_LAST; cnt++) { armv7_pmnc_select_counter(cnt); asm volatile("mrc p15, 0, %0, c9, c13, 2" : "=r" (val)); printk(KERN_INFO "CNT[%d] count =0x%08x\n", - cnt-ARMV7_EVENT_CNT_TO_CNTx, val); + ARMV7_IDX_TO_COUNTER(cnt), val); asm volatile("mrc p15, 0, %0, c9, c13, 1" : "=r" (val)); printk(KERN_INFO "CNT[%d] evtsel=0x%08x\n", - cnt-ARMV7_EVENT_CNT_TO_CNTx, val); + ARMV7_IDX_TO_COUNTER(cnt), val); } } #endif @@ -988,12 +936,13 @@ static void armv7_pmnc_dump_regs(void) static void armv7pmu_enable_event(struct hw_perf_event *hwc, int idx) { unsigned long flags; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); /* * Enable counter and interrupt, and set the counter to count * the event that we're interested in. */ - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); /* * Disable counter @@ -1002,9 +951,10 @@ static void armv7pmu_enable_event(struct hw_perf_event *hwc, int idx) /* * Set event (if destined for PMNx counters) - * We don't need to set the event if it's a cycle count + * We only need to set the event for the cycle counter if we + * have the ability to perform event filtering. 
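ARMV7_EVTYPE_MASK is 0xc00000ff: the 8-bit EVENT field plus the two PMUv2 exclude bits at the top of PMXEVTYPER. armv7pmu_set_event_filter() below folds the perf attr exclude flags into config_base, so the single armv7_pmnc_write_evtsel() path programs both pieces at once. Illustratively, with 0x11 standing in for an arbitrary event number:

	u32 evtype = 0x11;		/* EVENT field, bits [7:0] */
	evtype |= ARMV7_EXCLUDE_PL1;	/* bit 31: do not count the kernel */
	armv7_pmnc_write_evtsel(idx, evtype);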
*/ - if (idx != ARMV7_CYCLE_COUNTER) + if (armv7pmu.set_event_filter || idx != ARMV7_IDX_CYCLE_COUNTER) armv7_pmnc_write_evtsel(idx, hwc->config_base); /* @@ -1017,17 +967,18 @@ static void armv7pmu_enable_event(struct hw_perf_event *hwc, int idx) */ armv7_pmnc_enable_counter(idx); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void armv7pmu_disable_event(struct hw_perf_event *hwc, int idx) { unsigned long flags; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); /* * Disable counter and interrupt */ - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); /* * Disable counter @@ -1039,14 +990,14 @@ static void armv7pmu_disable_event(struct hw_perf_event *hwc, int idx) */ armv7_pmnc_disable_intens(idx); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev) { - unsigned long pmnc; + u32 pmnc; struct perf_sample_data data; - struct cpu_hw_events *cpuc; + struct pmu_hw_events *cpuc; struct pt_regs *regs; int idx; @@ -1069,13 +1020,10 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev) perf_sample_data_init(&data, 0); cpuc = &__get_cpu_var(cpu_hw_events); - for (idx = 0; idx <= armpmu->num_events; ++idx) { + for (idx = 0; idx < cpu_pmu->num_events; ++idx) { struct perf_event *event = cpuc->events[idx]; struct hw_perf_event *hwc; - if (!test_bit(idx, cpuc->active_mask)) - continue; - /* * We have a single interrupt for all counters. Check that * each counter has overflowed before we process it. @@ -1090,7 +1038,7 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev) continue; if (perf_event_overflow(event, &data, regs)) - armpmu->disable(hwc, idx); + cpu_pmu->disable(hwc, idx); } /* @@ -1108,61 +1056,114 @@ static irqreturn_t armv7pmu_handle_irq(int irq_num, void *dev) static void armv7pmu_start(void) { unsigned long flags; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); /* Enable all counters */ armv7_pmnc_write(armv7_pmnc_read() | ARMV7_PMNC_E); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void armv7pmu_stop(void) { unsigned long flags; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); /* Disable all counters */ armv7_pmnc_write(armv7_pmnc_read() & ~ARMV7_PMNC_E); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } -static int armv7pmu_get_event_idx(struct cpu_hw_events *cpuc, +static int armv7pmu_get_event_idx(struct pmu_hw_events *cpuc, struct hw_perf_event *event) { int idx; + unsigned long evtype = event->config_base & ARMV7_EVTYPE_EVENT; /* Always place a cycle counter into the cycle counter. 
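Index allocation is first-fit over the used_mask bitmap once the dedicated cycle slot has been considered. Assuming a PMU with num_events == 3 and three hypothetical events (the *_hwc names are placeholders, not part of this patch):

	idx = armv7pmu_get_event_idx(cpuc, &cycles_hwc);	/* 0, ARMV7_IDX_CYCLE_COUNTER */
	idx = armv7pmu_get_event_idx(cpuc, &miss_hwc);		/* 1, ARMV7_IDX_COUNTER0 */
	idx = armv7pmu_get_event_idx(cpuc, &branch_hwc);	/* 2 */
	idx = armv7pmu_get_event_idx(cpuc, &extra_hwc);		/* -EAGAIN, all in use */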
*/ - if (event->config_base == ARMV7_PERFCTR_CPU_CYCLES) { - if (test_and_set_bit(ARMV7_CYCLE_COUNTER, cpuc->used_mask)) + if (evtype == ARMV7_PERFCTR_CPU_CYCLES) { + if (test_and_set_bit(ARMV7_IDX_CYCLE_COUNTER, cpuc->used_mask)) return -EAGAIN; - return ARMV7_CYCLE_COUNTER; - } else { - /* - * For anything other than a cycle counter, try and use - * the events counters - */ - for (idx = ARMV7_COUNTER0; idx <= armpmu->num_events; ++idx) { - if (!test_and_set_bit(idx, cpuc->used_mask)) - return idx; - } + return ARMV7_IDX_CYCLE_COUNTER; + } - /* The counters are all in use. */ - return -EAGAIN; + /* + * For anything other than a cycle counter, try and use + * the events counters + */ + for (idx = ARMV7_IDX_COUNTER0; idx < cpu_pmu->num_events; ++idx) { + if (!test_and_set_bit(idx, cpuc->used_mask)) + return idx; } + + /* The counters are all in use. */ + return -EAGAIN; +} + +/* + * Add an event filter to a given event. This will only work for PMUv2 PMUs. + */ +static int armv7pmu_set_event_filter(struct hw_perf_event *event, + struct perf_event_attr *attr) +{ + unsigned long config_base = 0; + + if (attr->exclude_idle) + return -EPERM; + if (attr->exclude_user) + config_base |= ARMV7_EXCLUDE_USER; + if (attr->exclude_kernel) + config_base |= ARMV7_EXCLUDE_PL1; + if (!attr->exclude_hv) + config_base |= ARMV7_INCLUDE_HYP; + + /* + * Install the filter into config_base as this is used to + * construct the event type. + */ + event->config_base = config_base; + + return 0; } static void armv7pmu_reset(void *info) { - u32 idx, nb_cnt = armpmu->num_events; + u32 idx, nb_cnt = cpu_pmu->num_events; /* The counter and interrupt enable registers are unknown at reset. */ - for (idx = 1; idx < nb_cnt; ++idx) + for (idx = ARMV7_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx) armv7pmu_disable_event(NULL, idx); /* Initialize & Reset PMNC: C and P bits */ armv7_pmnc_write(ARMV7_PMNC_P | ARMV7_PMNC_C); } +static int armv7_a8_map_event(struct perf_event *event) +{ + return map_cpu_event(event, &armv7_a8_perf_map, + &armv7_a8_perf_cache_map, 0xFF); +} + +static int armv7_a9_map_event(struct perf_event *event) +{ + return map_cpu_event(event, &armv7_a9_perf_map, + &armv7_a9_perf_cache_map, 0xFF); +} + +static int armv7_a5_map_event(struct perf_event *event) +{ + return map_cpu_event(event, &armv7_a5_perf_map, + &armv7_a5_perf_cache_map, 0xFF); +} + +static int armv7_a15_map_event(struct perf_event *event) +{ + return map_cpu_event(event, &armv7_a15_perf_map, + &armv7_a15_perf_cache_map, 0xFF); +} + static struct arm_pmu armv7pmu = { .handle_irq = armv7pmu_handle_irq, .enable = armv7pmu_enable_event, @@ -1173,7 +1174,6 @@ static struct arm_pmu armv7pmu = { .start = armv7pmu_start, .stop = armv7pmu_stop, .reset = armv7pmu_reset, - .raw_event_mask = 0xFF, .max_period = (1LLU << 32) - 1, }; @@ -1188,62 +1188,59 @@ static u32 __init armv7_read_num_pmnc_events(void) return nb_cnt + 1; } -static const struct arm_pmu *__init armv7_a8_pmu_init(void) +static struct arm_pmu *__init armv7_a8_pmu_init(void) { armv7pmu.id = ARM_PERF_PMU_ID_CA8; armv7pmu.name = "ARMv7 Cortex-A8"; - armv7pmu.cache_map = &armv7_a8_perf_cache_map; - armv7pmu.event_map = &armv7_a8_perf_map; + armv7pmu.map_event = armv7_a8_map_event; armv7pmu.num_events = armv7_read_num_pmnc_events(); return &armv7pmu; } -static const struct arm_pmu *__init armv7_a9_pmu_init(void) +static struct arm_pmu *__init armv7_a9_pmu_init(void) { armv7pmu.id = ARM_PERF_PMU_ID_CA9; armv7pmu.name = "ARMv7 Cortex-A9"; - armv7pmu.cache_map = &armv7_a9_perf_cache_map; - 
armv7pmu.event_map = &armv7_a9_perf_map; + armv7pmu.map_event = armv7_a9_map_event; armv7pmu.num_events = armv7_read_num_pmnc_events(); return &armv7pmu; } -static const struct arm_pmu *__init armv7_a5_pmu_init(void) +static struct arm_pmu *__init armv7_a5_pmu_init(void) { armv7pmu.id = ARM_PERF_PMU_ID_CA5; armv7pmu.name = "ARMv7 Cortex-A5"; - armv7pmu.cache_map = &armv7_a5_perf_cache_map; - armv7pmu.event_map = &armv7_a5_perf_map; + armv7pmu.map_event = armv7_a5_map_event; armv7pmu.num_events = armv7_read_num_pmnc_events(); return &armv7pmu; } -static const struct arm_pmu *__init armv7_a15_pmu_init(void) +static struct arm_pmu *__init armv7_a15_pmu_init(void) { armv7pmu.id = ARM_PERF_PMU_ID_CA15; armv7pmu.name = "ARMv7 Cortex-A15"; - armv7pmu.cache_map = &armv7_a15_perf_cache_map; - armv7pmu.event_map = &armv7_a15_perf_map; + armv7pmu.map_event = armv7_a15_map_event; armv7pmu.num_events = armv7_read_num_pmnc_events(); + armv7pmu.set_event_filter = armv7pmu_set_event_filter; return &armv7pmu; } #else -static const struct arm_pmu *__init armv7_a8_pmu_init(void) +static struct arm_pmu *__init armv7_a8_pmu_init(void) { return NULL; } -static const struct arm_pmu *__init armv7_a9_pmu_init(void) +static struct arm_pmu *__init armv7_a9_pmu_init(void) { return NULL; } -static const struct arm_pmu *__init armv7_a5_pmu_init(void) +static struct arm_pmu *__init armv7_a5_pmu_init(void) { return NULL; } -static const struct arm_pmu *__init armv7_a15_pmu_init(void) +static struct arm_pmu *__init armv7_a15_pmu_init(void) { return NULL; } diff --git a/arch/arm/kernel/perf_event_xscale.c b/arch/arm/kernel/perf_event_xscale.c index 3c4397491d08..e0cca10a8411 100644 --- a/arch/arm/kernel/perf_event_xscale.c +++ b/arch/arm/kernel/perf_event_xscale.c @@ -40,7 +40,7 @@ enum xscale_perf_types { }; enum xscale_counters { - XSCALE_CYCLE_COUNTER = 1, + XSCALE_CYCLE_COUNTER = 0, XSCALE_COUNTER0, XSCALE_COUNTER1, XSCALE_COUNTER2, @@ -222,7 +222,7 @@ xscale1pmu_handle_irq(int irq_num, void *dev) { unsigned long pmnc; struct perf_sample_data data; - struct cpu_hw_events *cpuc; + struct pmu_hw_events *cpuc; struct pt_regs *regs; int idx; @@ -249,13 +249,10 @@ xscale1pmu_handle_irq(int irq_num, void *dev) perf_sample_data_init(&data, 0); cpuc = &__get_cpu_var(cpu_hw_events); - for (idx = 0; idx <= armpmu->num_events; ++idx) { + for (idx = 0; idx < cpu_pmu->num_events; ++idx) { struct perf_event *event = cpuc->events[idx]; struct hw_perf_event *hwc; - if (!test_bit(idx, cpuc->active_mask)) - continue; - if (!xscale1_pmnc_counter_has_overflowed(pmnc, idx)) continue; @@ -266,7 +263,7 @@ xscale1pmu_handle_irq(int irq_num, void *dev) continue; if (perf_event_overflow(event, &data, regs)) - armpmu->disable(hwc, idx); + cpu_pmu->disable(hwc, idx); } irq_work_run(); @@ -284,6 +281,7 @@ static void xscale1pmu_enable_event(struct hw_perf_event *hwc, int idx) { unsigned long val, mask, evt, flags; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); switch (idx) { case XSCALE_CYCLE_COUNTER: @@ -305,18 +303,19 @@ xscale1pmu_enable_event(struct hw_perf_event *hwc, int idx) return; } - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale1pmu_read_pmnc(); val &= ~mask; val |= evt; xscale1pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void xscale1pmu_disable_event(struct hw_perf_event *hwc, int idx) { unsigned long val, mask, evt, flags; + struct pmu_hw_events *events = 
cpu_pmu->get_hw_events(); switch (idx) { case XSCALE_CYCLE_COUNTER: @@ -336,16 +335,16 @@ xscale1pmu_disable_event(struct hw_perf_event *hwc, int idx) return; } - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale1pmu_read_pmnc(); val &= ~mask; val |= evt; xscale1pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static int -xscale1pmu_get_event_idx(struct cpu_hw_events *cpuc, +xscale1pmu_get_event_idx(struct pmu_hw_events *cpuc, struct hw_perf_event *event) { if (XSCALE_PERFCTR_CCNT == event->config_base) { @@ -368,24 +367,26 @@ static void xscale1pmu_start(void) { unsigned long flags, val; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale1pmu_read_pmnc(); val |= XSCALE_PMU_ENABLE; xscale1pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void xscale1pmu_stop(void) { unsigned long flags, val; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale1pmu_read_pmnc(); val &= ~XSCALE_PMU_ENABLE; xscale1pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static inline u32 @@ -424,7 +425,13 @@ xscale1pmu_write_counter(int counter, u32 val) } } -static const struct arm_pmu xscale1pmu = { +static int xscale_map_event(struct perf_event *event) +{ + return map_cpu_event(event, &xscale_perf_map, + &xscale_perf_cache_map, 0xFF); +} + +static struct arm_pmu xscale1pmu = { .id = ARM_PERF_PMU_ID_XSCALE1, .name = "xscale1", .handle_irq = xscale1pmu_handle_irq, @@ -435,14 +442,12 @@ static const struct arm_pmu xscale1pmu = { .get_event_idx = xscale1pmu_get_event_idx, .start = xscale1pmu_start, .stop = xscale1pmu_stop, - .cache_map = &xscale_perf_cache_map, - .event_map = &xscale_perf_map, - .raw_event_mask = 0xFF, + .map_event = xscale_map_event, .num_events = 3, .max_period = (1LLU << 32) - 1, }; -static const struct arm_pmu *__init xscale1pmu_init(void) +static struct arm_pmu *__init xscale1pmu_init(void) { return &xscale1pmu; } @@ -560,7 +565,7 @@ xscale2pmu_handle_irq(int irq_num, void *dev) { unsigned long pmnc, of_flags; struct perf_sample_data data; - struct cpu_hw_events *cpuc; + struct pmu_hw_events *cpuc; struct pt_regs *regs; int idx; @@ -581,13 +586,10 @@ xscale2pmu_handle_irq(int irq_num, void *dev) perf_sample_data_init(&data, 0); cpuc = &__get_cpu_var(cpu_hw_events); - for (idx = 0; idx <= armpmu->num_events; ++idx) { + for (idx = 0; idx < cpu_pmu->num_events; ++idx) { struct perf_event *event = cpuc->events[idx]; struct hw_perf_event *hwc; - if (!test_bit(idx, cpuc->active_mask)) - continue; - if (!xscale2_pmnc_counter_has_overflowed(pmnc, idx)) continue; @@ -598,7 +600,7 @@ xscale2pmu_handle_irq(int irq_num, void *dev) continue; if (perf_event_overflow(event, &data, regs)) - armpmu->disable(hwc, idx); + cpu_pmu->disable(hwc, idx); } irq_work_run(); @@ -616,6 +618,7 @@ static void xscale2pmu_enable_event(struct hw_perf_event *hwc, int idx) { unsigned long flags, ien, evtsel; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); ien = xscale2pmu_read_int_enable(); evtsel = xscale2pmu_read_event_select(); @@ -649,16 +652,17 @@ xscale2pmu_enable_event(struct hw_perf_event *hwc, int 
idx) return; } - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); xscale2pmu_write_event_select(evtsel); xscale2pmu_write_int_enable(ien); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx) { unsigned long flags, ien, evtsel; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); ien = xscale2pmu_read_int_enable(); evtsel = xscale2pmu_read_event_select(); @@ -692,14 +696,14 @@ xscale2pmu_disable_event(struct hw_perf_event *hwc, int idx) return; } - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); xscale2pmu_write_event_select(evtsel); xscale2pmu_write_int_enable(ien); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static int -xscale2pmu_get_event_idx(struct cpu_hw_events *cpuc, +xscale2pmu_get_event_idx(struct pmu_hw_events *cpuc, struct hw_perf_event *event) { int idx = xscale1pmu_get_event_idx(cpuc, event); @@ -718,24 +722,26 @@ static void xscale2pmu_start(void) { unsigned long flags, val; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale2pmu_read_pmnc() & ~XSCALE_PMU_CNT64; val |= XSCALE_PMU_ENABLE; xscale2pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static void xscale2pmu_stop(void) { unsigned long flags, val; + struct pmu_hw_events *events = cpu_pmu->get_hw_events(); - raw_spin_lock_irqsave(&pmu_lock, flags); + raw_spin_lock_irqsave(&events->pmu_lock, flags); val = xscale2pmu_read_pmnc(); val &= ~XSCALE_PMU_ENABLE; xscale2pmu_write_pmnc(val); - raw_spin_unlock_irqrestore(&pmu_lock, flags); + raw_spin_unlock_irqrestore(&events->pmu_lock, flags); } static inline u32 @@ -786,7 +792,7 @@ xscale2pmu_write_counter(int counter, u32 val) } } -static const struct arm_pmu xscale2pmu = { +static struct arm_pmu xscale2pmu = { .id = ARM_PERF_PMU_ID_XSCALE2, .name = "xscale2", .handle_irq = xscale2pmu_handle_irq, @@ -797,24 +803,22 @@ static const struct arm_pmu xscale2pmu = { .get_event_idx = xscale2pmu_get_event_idx, .start = xscale2pmu_start, .stop = xscale2pmu_stop, - .cache_map = &xscale_perf_cache_map, - .event_map = &xscale_perf_map, - .raw_event_mask = 0xFF, + .map_event = xscale_map_event, .num_events = 5, .max_period = (1LLU << 32) - 1, }; -static const struct arm_pmu *__init xscale2pmu_init(void) +static struct arm_pmu *__init xscale2pmu_init(void) { return &xscale2pmu; } #else -static const struct arm_pmu *__init xscale1pmu_init(void) +static struct arm_pmu *__init xscale1pmu_init(void) { return NULL; } -static const struct arm_pmu *__init xscale2pmu_init(void) +static struct arm_pmu *__init xscale2pmu_init(void) { return NULL; } diff --git a/arch/arm/kernel/pmu.c b/arch/arm/kernel/pmu.c index c53474fe84df..2c3407ee8576 100644 --- a/arch/arm/kernel/pmu.c +++ b/arch/arm/kernel/pmu.c @@ -10,192 +10,26 @@ * */ -#define pr_fmt(fmt) "PMU: " fmt - -#include #include -#include #include #include -#include -#include #include -static volatile long pmu_lock; - -static struct platform_device *pmu_devices[ARM_NUM_PMU_DEVICES]; - -static int __devinit pmu_register(struct platform_device *pdev, - enum arm_pmu_type type) -{ - if (type < 0 || type >= ARM_NUM_PMU_DEVICES) { - pr_warning("received registration request for unknown " - "PMU 
device type %d\n", type); - return -EINVAL; - } - - if (pmu_devices[type]) { - pr_warning("rejecting duplicate registration of PMU device " - "type %d.", type); - return -ENOSPC; - } - - pr_info("registered new PMU device of type %d\n", type); - pmu_devices[type] = pdev; - return 0; -} - -#define OF_MATCH_PMU(_name, _type) { \ - .compatible = _name, \ - .data = (void *)_type, \ -} - -#define OF_MATCH_CPU(name) OF_MATCH_PMU(name, ARM_PMU_DEVICE_CPU) - -static struct of_device_id armpmu_of_device_ids[] = { - OF_MATCH_CPU("arm,cortex-a9-pmu"), - OF_MATCH_CPU("arm,cortex-a8-pmu"), - OF_MATCH_CPU("arm,arm1136-pmu"), - OF_MATCH_CPU("arm,arm1176-pmu"), - {}, -}; - -#define PLAT_MATCH_PMU(_name, _type) { \ - .name = _name, \ - .driver_data = _type, \ -} - -#define PLAT_MATCH_CPU(_name) PLAT_MATCH_PMU(_name, ARM_PMU_DEVICE_CPU) - -static struct platform_device_id armpmu_plat_device_ids[] = { - PLAT_MATCH_CPU("arm-pmu"), - {}, -}; - -enum arm_pmu_type armpmu_device_type(struct platform_device *pdev) -{ - const struct of_device_id *of_id; - const struct platform_device_id *pdev_id; - - /* provided by of_device_id table */ - if (pdev->dev.of_node) { - of_id = of_match_device(armpmu_of_device_ids, &pdev->dev); - BUG_ON(!of_id); - return (enum arm_pmu_type)of_id->data; - } - - /* Provided by platform_device_id table */ - pdev_id = platform_get_device_id(pdev); - BUG_ON(!pdev_id); - return pdev_id->driver_data; -} - -static int __devinit armpmu_device_probe(struct platform_device *pdev) -{ - return pmu_register(pdev, armpmu_device_type(pdev)); -} - -static struct platform_driver armpmu_driver = { - .driver = { - .name = "arm-pmu", - .of_match_table = armpmu_of_device_ids, - }, - .probe = armpmu_device_probe, - .id_table = armpmu_plat_device_ids, -}; - -static int __init register_pmu_driver(void) -{ - return platform_driver_register(&armpmu_driver); -} -device_initcall(register_pmu_driver); +/* + * PMU locking to ensure mutual exclusion between different subsystems. + */ +static unsigned long pmu_lock[BITS_TO_LONGS(ARM_NUM_PMU_DEVICES)]; -struct platform_device * +int reserve_pmu(enum arm_pmu_type type) { - struct platform_device *pdev; - - if (test_and_set_bit_lock(type, &pmu_lock)) { - pdev = ERR_PTR(-EBUSY); - } else if (pmu_devices[type] == NULL) { - clear_bit_unlock(type, &pmu_lock); - pdev = ERR_PTR(-ENODEV); - } else { - pdev = pmu_devices[type]; - } - - return pdev; + return test_and_set_bit_lock(type, pmu_lock) ? -EBUSY : 0; } EXPORT_SYMBOL_GPL(reserve_pmu); -int +void release_pmu(enum arm_pmu_type type) { - if (WARN_ON(!pmu_devices[type])) - return -EINVAL; - clear_bit_unlock(type, &pmu_lock); - return 0; -} -EXPORT_SYMBOL_GPL(release_pmu); - -static int -set_irq_affinity(int irq, - unsigned int cpu) -{ -#ifdef CONFIG_SMP - int err = irq_set_affinity(irq, cpumask_of(cpu)); - if (err) - pr_warning("unable to set irq affinity (irq=%d, cpu=%u)\n", - irq, cpu); - return err; -#else - return -EINVAL; -#endif -} - -static int -init_cpu_pmu(void) -{ - int i, irqs, err = 0; - struct platform_device *pdev = pmu_devices[ARM_PMU_DEVICE_CPU]; - - if (!pdev) - return -ENODEV; - - irqs = pdev->num_resources; - - /* - * If we have a single PMU interrupt that we can't shift, assume that - * we're running on a uniprocessor machine and continue. 
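The reservation interface is now a bare bit-lock per PMU type: reserve_pmu() atomically claims the bit or fails with -EBUSY, and release_pmu() simply drops it. A sketch of the intended calling pattern, using the ARM_PMU_DEVICE_CPU type referenced in the code being removed here:

	if (reserve_pmu(ARM_PMU_DEVICE_CPU))
		return -EBUSY;	/* another subsystem owns the PMU */
	/* ... request IRQs and program the counters ... */
	release_pmu(ARM_PMU_DEVICE_CPU);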
- */ - if (irqs == 1 && !irq_can_set_affinity(platform_get_irq(pdev, 0))) - return 0; - - for (i = 0; i < irqs; ++i) { - err = set_irq_affinity(platform_get_irq(pdev, i), i); - if (err) - break; - } - - return err; -} - -int -init_pmu(enum arm_pmu_type type) -{ - int err = 0; - - switch (type) { - case ARM_PMU_DEVICE_CPU: - err = init_cpu_pmu(); - break; - default: - pr_warning("attempt to initialise PMU of unknown " - "type %d\n", type); - err = -EINVAL; - } - - return err; + clear_bit_unlock(type, pmu_lock); } -EXPORT_SYMBOL_GPL(init_pmu); diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index e514c76043b4..6136144f8f8d 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -820,25 +820,8 @@ static struct machine_desc * __init setup_machine_tags(unsigned int nr) if (__atags_pointer) tags = phys_to_virt(__atags_pointer); - else if (mdesc->boot_params) { -#ifdef CONFIG_MMU - /* - * We still are executing with a minimal MMU mapping created - * with the presumption that the machine default for this - * is located in the first MB of RAM. Anything else will - * fault and silently hang the kernel at this point. - */ - if (mdesc->boot_params < PHYS_OFFSET || - mdesc->boot_params >= PHYS_OFFSET + SZ_1M) { - printk(KERN_WARNING - "Default boot params at physical 0x%08lx out of reach\n", - mdesc->boot_params); - } else -#endif - { - tags = phys_to_virt(mdesc->boot_params); - } - } + else if (mdesc->atag_offset) + tags = (void *)(PAGE_OFFSET + mdesc->atag_offset); #if defined(CONFIG_DEPRECATED_PARAM_STRUCT) /* diff --git a/arch/arm/mach-at91/at91sam9261.c b/arch/arm/mach-at91/at91sam9261.c index d522b47e30b5..6c8e3b5f669f 100644 --- a/arch/arm/mach-at91/at91sam9261.c +++ b/arch/arm/mach-at91/at91sam9261.c @@ -157,7 +157,7 @@ static struct clk_lookup periph_clocks_lookups[] = { CLKDEV_CON_DEV_ID("spi_clk", "atmel_spi.1", &spi1_clk), CLKDEV_CON_DEV_ID("t0_clk", "atmel_tcb.0", &tc0_clk), CLKDEV_CON_DEV_ID("t1_clk", "atmel_tcb.0", &tc1_clk), - CLKDEV_CON_DEV_ID("t2_clk", "atmel_tcb.0", &tc1_clk), + CLKDEV_CON_DEV_ID("t2_clk", "atmel_tcb.0", &tc2_clk), CLKDEV_CON_DEV_ID("pclk", "ssc.0", &ssc0_clk), CLKDEV_CON_DEV_ID("pclk", "ssc.1", &ssc1_clk), CLKDEV_CON_DEV_ID("pclk", "ssc.2", &ssc2_clk), diff --git a/arch/arm/mach-at91/at91sam9g45.c b/arch/arm/mach-at91/at91sam9g45.c index e04c5fb6f1ee..1532b508c814 100644 --- a/arch/arm/mach-at91/at91sam9g45.c +++ b/arch/arm/mach-at91/at91sam9g45.c @@ -12,6 +12,7 @@ #include #include +#include #include #include @@ -319,6 +320,7 @@ static void at91sam9g45_poweroff(void) static void __init at91sam9g45_map_io(void) { at91_init_sram(0, AT91SAM9G45_SRAM_BASE, AT91SAM9G45_SRAM_SIZE); + init_consistent_dma_size(SZ_4M); } static void __init at91sam9g45_initialize(void) diff --git a/arch/arm/mach-at91/include/mach/at91sam9g45.h b/arch/arm/mach-at91/include/mach/at91sam9g45.h index 2c611b9a0138..406bb6496805 100644 --- a/arch/arm/mach-at91/include/mach/at91sam9g45.h +++ b/arch/arm/mach-at91/include/mach/at91sam9g45.h @@ -128,8 +128,6 @@ #define AT91SAM9G45_EHCI_BASE 0x00800000 /* USB Host controller (EHCI) */ #define AT91SAM9G45_VDEC_BASE 0x00900000 /* Video Decoder Controller */ -#define CONSISTENT_DMA_SIZE SZ_4M - /* * DMA peripheral identifiers * for hardware handshaking interface diff --git a/arch/arm/mach-bcmring/include/mach/memory.h b/arch/arm/mach-bcmring/include/mach/memory.h index 15162e4c75f9..8848a5bb3445 100644 --- a/arch/arm/mach-bcmring/include/mach/memory.h +++ b/arch/arm/mach-bcmring/include/mach/memory.h @@ -25,9 +25,4 @@ #define 
PLAT_PHYS_OFFSET CFG_GLOBAL_RAM_BASE -/* - * Maximum DMA memory allowed is 14M - */ -#define CONSISTENT_DMA_SIZE (SZ_16M - SZ_2M) - #endif diff --git a/arch/arm/mach-bcmring/mm.c b/arch/arm/mach-bcmring/mm.c index 0f1c37e4523a..8616876abb9f 100644 --- a/arch/arm/mach-bcmring/mm.c +++ b/arch/arm/mach-bcmring/mm.c @@ -13,6 +13,7 @@ *****************************************************************************/ #include +#include #include #include @@ -53,4 +54,6 @@ void __init bcmring_map_io(void) { iotable_init(bcmring_io_desc, ARRAY_SIZE(bcmring_io_desc)); + /* Maximum DMA memory allowed is 14M */ + init_consistent_dma_size(14 << 20); } diff --git a/arch/arm/mach-clps711x/autcpu12.c b/arch/arm/mach-clps711x/autcpu12.c index 4a74b2c959bd..0276091b7f86 100644 --- a/arch/arm/mach-clps711x/autcpu12.c +++ b/arch/arm/mach-clps711x/autcpu12.c @@ -64,7 +64,7 @@ void __init autcpu12_map_io(void) MACHINE_START(AUTCPU12, "autronix autcpu12") /* Maintainer: Thomas Gleixner */ - .boot_params = 0xc0020000, + .atag_offset = 0x20000, .map_io = autcpu12_map_io, .init_irq = clps711x_init_irq, .timer = &clps711x_timer, diff --git a/arch/arm/mach-clps711x/cdb89712.c b/arch/arm/mach-clps711x/cdb89712.c index 5a1689d48793..25b3bfd0e85a 100644 --- a/arch/arm/mach-clps711x/cdb89712.c +++ b/arch/arm/mach-clps711x/cdb89712.c @@ -55,7 +55,7 @@ static void __init cdb89712_map_io(void) MACHINE_START(CDB89712, "Cirrus-CDB89712") /* Maintainer: Ray Lehtiniemi */ - .boot_params = 0xc0000100, + .atag_offset = 0x100, .map_io = cdb89712_map_io, .init_irq = clps711x_init_irq, .timer = &clps711x_timer, diff --git a/arch/arm/mach-clps711x/ceiva.c b/arch/arm/mach-clps711x/ceiva.c index 16481cf3e931..1df9ec67aa92 100644 --- a/arch/arm/mach-clps711x/ceiva.c +++ b/arch/arm/mach-clps711x/ceiva.c @@ -56,7 +56,7 @@ static void __init ceiva_map_io(void) MACHINE_START(CEIVA, "CEIVA/Polaroid Photo MAX Digital Picture Frame") /* Maintainer: Rob Scott */ - .boot_params = 0xc0000100, + .atag_offset = 0x100, .map_io = ceiva_map_io, .init_irq = clps711x_init_irq, .timer = &clps711x_timer, diff --git a/arch/arm/mach-clps711x/clep7312.c b/arch/arm/mach-clps711x/clep7312.c index 67b5abb4a60a..06c8abd9371f 100644 --- a/arch/arm/mach-clps711x/clep7312.c +++ b/arch/arm/mach-clps711x/clep7312.c @@ -37,7 +37,7 @@ fixup_clep7312(struct machine_desc *desc, struct tag *tags, MACHINE_START(CLEP7212, "Cirrus Logic 7212/7312") /* Maintainer: Nobody */ - .boot_params = 0xc0000100, + .atag_offset = 0x0100, .fixup = fixup_clep7312, .map_io = clps711x_map_io, .init_irq = clps711x_init_irq, diff --git a/arch/arm/mach-clps711x/edb7211-arch.c b/arch/arm/mach-clps711x/edb7211-arch.c index 98ca5b2e940d..abf522d1ec9b 100644 --- a/arch/arm/mach-clps711x/edb7211-arch.c +++ b/arch/arm/mach-clps711x/edb7211-arch.c @@ -57,7 +57,7 @@ fixup_edb7211(struct machine_desc *desc, struct tag *tags, MACHINE_START(EDB7211, "CL-EDB7211 (EP7211 eval board)") /* Maintainer: Jon McClintock */ - .boot_params = 0xc0020100, /* 0xc0000000 - 0xc001ffff can be video RAM */ + .atag_offset = 0x20100, /* 0xc0000000 - 0xc001ffff can be video RAM */ .fixup = fixup_edb7211, .map_io = edb7211_map_io, .reserve = edb7211_reserve, diff --git a/arch/arm/mach-clps711x/fortunet.c b/arch/arm/mach-clps711x/fortunet.c index b1cb479e71e9..b6f7d86bb1c9 100644 --- a/arch/arm/mach-clps711x/fortunet.c +++ b/arch/arm/mach-clps711x/fortunet.c @@ -75,7 +75,6 @@ fortunet_fixup(struct machine_desc *desc, struct tag *tags, MACHINE_START(FORTUNET, "ARM-FortuNet") /* Maintainer: FortuNet Inc. 
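The long series of board conversions in this patch all apply the rule from the setup.c hunk above: instead of hard-coding the absolute address of the ATAG list, a machine records only its offset from the start of RAM, and the kernel computes tags = (void *)(PAGE_OFFSET + mdesc->atag_offset) itself. Schematically, using the DaVinci base as an example:

	.boot_params = DAVINCI_DDR_BASE + 0x100,	/* before: absolute address */
	.atag_offset = 0x100,				/* after: offset into RAM */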
*/ - .boot_params = 0x00000000, .fixup = fortunet_fixup, .map_io = clps711x_map_io, .init_irq = clps711x_init_irq, diff --git a/arch/arm/mach-clps711x/p720t.c b/arch/arm/mach-clps711x/p720t.c index cefbce0480b9..e7f75aeb1e5b 100644 --- a/arch/arm/mach-clps711x/p720t.c +++ b/arch/arm/mach-clps711x/p720t.c @@ -89,7 +89,7 @@ static void __init p720t_map_io(void) MACHINE_START(P720T, "ARM-Prospector720T") /* Maintainer: ARM Ltd/Deep Blue Solutions Ltd */ - .boot_params = 0xc0000100, + .atag_offset = 0x100, .fixup = fixup_p720t, .map_io = p720t_map_io, .init_irq = clps711x_init_irq, diff --git a/arch/arm/mach-cns3xxx/cns3420vb.c b/arch/arm/mach-cns3xxx/cns3420vb.c index 3e7d1496cb47..55f7b4b08ab9 100644 --- a/arch/arm/mach-cns3xxx/cns3420vb.c +++ b/arch/arm/mach-cns3xxx/cns3420vb.c @@ -197,7 +197,7 @@ static void __init cns3420_map_io(void) } MACHINE_START(CNS3420VB, "Cavium Networks CNS3420 Validation Board") - .boot_params = 0x00000100, + .atag_offset = 0x100, .map_io = cns3420_map_io, .init_irq = cns3xxx_init_irq, .timer = &cns3xxx_timer, diff --git a/arch/arm/mach-cns3xxx/include/mach/entry-macro.S b/arch/arm/mach-cns3xxx/include/mach/entry-macro.S index 6bd83ed90afe..d87bfc397d39 100644 --- a/arch/arm/mach-cns3xxx/include/mach/entry-macro.S +++ b/arch/arm/mach-cns3xxx/include/mach/entry-macro.S @@ -8,7 +8,6 @@ * published by the Free Software Foundation. */ -#include #include .macro disable_fiq diff --git a/arch/arm/mach-cns3xxx/include/mach/system.h b/arch/arm/mach-cns3xxx/include/mach/system.h index 58bb03ae3cf4..4f16c9b79f78 100644 --- a/arch/arm/mach-cns3xxx/include/mach/system.h +++ b/arch/arm/mach-cns3xxx/include/mach/system.h @@ -13,7 +13,6 @@ #include #include -#include static inline void arch_idle(void) { diff --git a/arch/arm/mach-cns3xxx/include/mach/uncompress.h b/arch/arm/mach-cns3xxx/include/mach/uncompress.h index de8ead9b91f7..a91b6058ab4f 100644 --- a/arch/arm/mach-cns3xxx/include/mach/uncompress.h +++ b/arch/arm/mach-cns3xxx/include/mach/uncompress.h @@ -8,7 +8,6 @@ */ #include -#include #include #define AMBA_UART_DR(base) (*(volatile unsigned char *)((base) + 0x00)) diff --git a/arch/arm/mach-cns3xxx/pcie.c b/arch/arm/mach-cns3xxx/pcie.c index 06fd25d70aec..0f8fca48a5ed 100644 --- a/arch/arm/mach-cns3xxx/pcie.c +++ b/arch/arm/mach-cns3xxx/pcie.c @@ -49,7 +49,7 @@ static struct cns3xxx_pcie *sysdata_to_cnspci(void *sysdata) return &cns3xxx_pcie[root->domain]; } -static struct cns3xxx_pcie *pdev_to_cnspci(struct pci_dev *dev) +static struct cns3xxx_pcie *pdev_to_cnspci(const struct pci_dev *dev) { return sysdata_to_cnspci(dev->sysdata); } diff --git a/arch/arm/mach-davinci/board-da830-evm.c b/arch/arm/mach-davinci/board-da830-evm.c index 84fd78684868..26d94c0b555c 100644 --- a/arch/arm/mach-davinci/board-da830-evm.c +++ b/arch/arm/mach-davinci/board-da830-evm.c @@ -676,7 +676,7 @@ static void __init da830_evm_map_io(void) } MACHINE_START(DAVINCI_DA830_EVM, "DaVinci DA830/OMAP-L137/AM17x EVM") - .boot_params = (DA8XX_DDR_BASE + 0x100), + .atag_offset = 0x100, .map_io = da830_evm_map_io, .init_irq = cp_intc_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-da850-evm.c b/arch/arm/mach-davinci/board-da850-evm.c index bd5394537c88..6e41cb5baeb4 100644 --- a/arch/arm/mach-davinci/board-da850-evm.c +++ b/arch/arm/mach-davinci/board-da850-evm.c @@ -115,6 +115,32 @@ static struct spi_board_info da850evm_spi_info[] = { }, }; +#ifdef CONFIG_MTD +static void da850_evm_m25p80_notify_add(struct mtd_info *mtd) +{ + char *mac_addr = 
davinci_soc_info.emac_pdata->mac_addr; + size_t retlen; + + if (!strcmp(mtd->name, "MAC-Address")) { + mtd->read(mtd, 0, ETH_ALEN, &retlen, mac_addr); + if (retlen == ETH_ALEN) + pr_info("Read MAC addr from SPI Flash: %pM\n", + mac_addr); + } +} + +static struct mtd_notifier da850evm_spi_notifier = { + .add = da850_evm_m25p80_notify_add, +}; + +static void da850_evm_setup_mac_addr(void) +{ + register_mtd_user(&da850evm_spi_notifier); +} +#else +static void da850_evm_setup_mac_addr(void) { } +#endif + static struct mtd_partition da850_evm_norflash_partition[] = { { .name = "bootloaders + env", @@ -1244,6 +1270,8 @@ static __init void da850_evm_init(void) if (ret) pr_warning("da850_evm_init: sata registration failed: %d\n", ret); + + da850_evm_setup_mac_addr(); } #ifdef CONFIG_SERIAL_8250_CONSOLE @@ -1263,7 +1291,7 @@ static void __init da850_evm_map_io(void) } MACHINE_START(DAVINCI_DA850_EVM, "DaVinci DA850/OMAP-L138/AM18x EVM") - .boot_params = (DA8XX_DDR_BASE + 0x100), + .atag_offset = 0x100, .map_io = da850_evm_map_io, .init_irq = cp_intc_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-dm355-evm.c b/arch/arm/mach-davinci/board-dm355-evm.c index 241a6bd67408..65566280b7c9 100644 --- a/arch/arm/mach-davinci/board-dm355-evm.c +++ b/arch/arm/mach-davinci/board-dm355-evm.c @@ -351,7 +351,7 @@ static __init void dm355_evm_init(void) } MACHINE_START(DAVINCI_DM355_EVM, "DaVinci DM355 EVM") - .boot_params = (0x80000100), + .atag_offset = 0x100, .map_io = dm355_evm_map_io, .init_irq = davinci_irq_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-dm355-leopard.c b/arch/arm/mach-davinci/board-dm355-leopard.c index bee284ca7fd6..b307470b071d 100644 --- a/arch/arm/mach-davinci/board-dm355-leopard.c +++ b/arch/arm/mach-davinci/board-dm355-leopard.c @@ -270,7 +270,7 @@ static __init void dm355_leopard_init(void) } MACHINE_START(DM355_LEOPARD, "DaVinci DM355 leopard") - .boot_params = (0x80000100), + .atag_offset = 0x100, .map_io = dm355_leopard_map_io, .init_irq = davinci_irq_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-dm365-evm.c b/arch/arm/mach-davinci/board-dm365-evm.c index 9818f214d4f0..04c43abcca66 100644 --- a/arch/arm/mach-davinci/board-dm365-evm.c +++ b/arch/arm/mach-davinci/board-dm365-evm.c @@ -612,7 +612,7 @@ static __init void dm365_evm_init(void) } MACHINE_START(DAVINCI_DM365_EVM, "DaVinci DM365 EVM") - .boot_params = (0x80000100), + .atag_offset = 0x100, .map_io = dm365_evm_map_io, .init_irq = davinci_irq_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-dm644x-evm.c b/arch/arm/mach-davinci/board-dm644x-evm.c index 95607a191e03..a005e7691ddd 100644 --- a/arch/arm/mach-davinci/board-dm644x-evm.c +++ b/arch/arm/mach-davinci/board-dm644x-evm.c @@ -712,7 +712,7 @@ static __init void davinci_evm_init(void) MACHINE_START(DAVINCI_EVM, "DaVinci DM644x EVM") /* Maintainer: MontaVista Software */ - .boot_params = (DAVINCI_DDR_BASE + 0x100), + .atag_offset = 0x100, .map_io = davinci_evm_map_io, .init_irq = davinci_irq_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-dm646x-evm.c b/arch/arm/mach-davinci/board-dm646x-evm.c index 993a3146fd35..337c45e3e44d 100644 --- a/arch/arm/mach-davinci/board-dm646x-evm.c +++ b/arch/arm/mach-davinci/board-dm646x-evm.c @@ -792,7 +792,7 @@ static __init void evm_init(void) } MACHINE_START(DAVINCI_DM6467_EVM, "DaVinci DM646x EVM") - .boot_params = (0x80000100), + .atag_offset = 0x100, .map_io = davinci_map_io, .init_irq = davinci_irq_init, .timer = 
&davinci_timer, @@ -801,7 +801,7 @@ MACHINE_START(DAVINCI_DM6467_EVM, "DaVinci DM646x EVM") MACHINE_END MACHINE_START(DAVINCI_DM6467TEVM, "DaVinci DM6467T EVM") - .boot_params = (0x80000100), + .atag_offset = 0x100, .map_io = davinci_map_io, .init_irq = davinci_irq_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-mityomapl138.c b/arch/arm/mach-davinci/board-mityomapl138.c index c278226627ad..6efc84cceca0 100644 --- a/arch/arm/mach-davinci/board-mityomapl138.c +++ b/arch/arm/mach-davinci/board-mityomapl138.c @@ -566,7 +566,7 @@ static void __init mityomapl138_map_io(void) } MACHINE_START(MITYOMAPL138, "MityDSP-L138/MityARM-1808") - .boot_params = (DA8XX_DDR_BASE + 0x100), + .atag_offset = 0x100, .map_io = mityomapl138_map_io, .init_irq = cp_intc_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-neuros-osd2.c b/arch/arm/mach-davinci/board-neuros-osd2.c index d60a80028ba3..38d6f644d8b9 100644 --- a/arch/arm/mach-davinci/board-neuros-osd2.c +++ b/arch/arm/mach-davinci/board-neuros-osd2.c @@ -272,7 +272,7 @@ static __init void davinci_ntosd2_init(void) MACHINE_START(NEUROS_OSD2, "Neuros OSD2") /* Maintainer: Neuros Technologies */ - .boot_params = (DAVINCI_DDR_BASE + 0x100), + .atag_offset = 0x100, .map_io = davinci_ntosd2_map_io, .init_irq = davinci_irq_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-omapl138-hawk.c b/arch/arm/mach-davinci/board-omapl138-hawk.c index 237332a11421..c6701e4a795c 100644 --- a/arch/arm/mach-davinci/board-omapl138-hawk.c +++ b/arch/arm/mach-davinci/board-omapl138-hawk.c @@ -338,7 +338,7 @@ static void __init omapl138_hawk_map_io(void) } MACHINE_START(OMAPL138_HAWKBOARD, "AM18x/OMAP-L138 Hawkboard") - .boot_params = (DA8XX_DDR_BASE + 0x100), + .atag_offset = 0x100, .map_io = omapl138_hawk_map_io, .init_irq = cp_intc_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-sffsdr.c b/arch/arm/mach-davinci/board-sffsdr.c index 5f4385c0a089..5dd4da9d2308 100644 --- a/arch/arm/mach-davinci/board-sffsdr.c +++ b/arch/arm/mach-davinci/board-sffsdr.c @@ -151,7 +151,7 @@ static __init void davinci_sffsdr_init(void) MACHINE_START(SFFSDR, "Lyrtech SFFSDR") /* Maintainer: Hugo Villeneuve hugo.villeneuve@lyrtech.com */ - .boot_params = (DAVINCI_DDR_BASE + 0x100), + .atag_offset = 0x100, .map_io = davinci_sffsdr_map_io, .init_irq = davinci_irq_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/board-tnetv107x-evm.c b/arch/arm/mach-davinci/board-tnetv107x-evm.c index 782892065682..90ee7b5aabdc 100644 --- a/arch/arm/mach-davinci/board-tnetv107x-evm.c +++ b/arch/arm/mach-davinci/board-tnetv107x-evm.c @@ -277,7 +277,7 @@ console_initcall(tnetv107x_evm_console_init); #endif MACHINE_START(TNETV107X, "TNETV107X EVM") - .boot_params = (TNETV107X_DDR_BASE + 0x100), + .atag_offset = 0x100, .map_io = tnetv107x_init, .init_irq = cp_intc_init, .timer = &davinci_timer, diff --git a/arch/arm/mach-davinci/common.c b/arch/arm/mach-davinci/common.c index 1d2557394235..865ffe5899ac 100644 --- a/arch/arm/mach-davinci/common.c +++ b/arch/arm/mach-davinci/common.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include @@ -86,6 +87,8 @@ void __init davinci_common_init(struct davinci_soc_info *soc_info) iotable_init(davinci_soc_info.io_desc, davinci_soc_info.io_desc_num); + init_consistent_dma_size(14 << 20); + /* * Normally devicemaps_init() would flush caches and tlb after * mdesc->map_io(), but we must also do it here because of the CPU diff --git 
a/arch/arm/mach-davinci/include/mach/memory.h b/arch/arm/mach-davinci/include/mach/memory.h index 78731944a70c..885d23319668 100644 --- a/arch/arm/mach-davinci/include/mach/memory.h +++ b/arch/arm/mach-davinci/include/mach/memory.h @@ -36,9 +36,4 @@ #define DDR2_MCLKSTOPEN_BIT BIT(30) #define DDR2_LPMODEN_BIT BIT(31) -/* - * Increase size of DMA-consistent memory region - */ -#define CONSISTENT_DMA_SIZE (14<<20) - #endif /* __ASM_ARCH_MEMORY_H */ diff --git a/arch/arm/mach-davinci/include/mach/psc.h b/arch/arm/mach-davinci/include/mach/psc.h index 47fd0bc3d3e7..fa59c097223d 100644 --- a/arch/arm/mach-davinci/include/mach/psc.h +++ b/arch/arm/mach-davinci/include/mach/psc.h @@ -243,7 +243,7 @@ #define PSC_STATE_DISABLE 2 #define PSC_STATE_ENABLE 3 -#define MDSTAT_STATE_MASK 0x1f +#define MDSTAT_STATE_MASK 0x3f #define MDCTL_FORCE BIT(31) #ifndef __ASSEMBLER__ diff --git a/arch/arm/mach-davinci/sleep.S b/arch/arm/mach-davinci/sleep.S index fb5e72b532b0..5f1e045a3ad1 100644 --- a/arch/arm/mach-davinci/sleep.S +++ b/arch/arm/mach-davinci/sleep.S @@ -217,7 +217,11 @@ ddr2clk_stop_done: ENDPROC(davinci_ddr_psc_config) CACHE_FLUSH: - .word arm926_flush_kern_cache_all +#ifdef CONFIG_CPU_V6 + .word v6_flush_kern_cache_all +#else + .word arm926_flush_kern_cache_all +#endif ENTRY(davinci_cpu_suspend_sz) .word . - davinci_cpu_suspend diff --git a/arch/arm/mach-dove/cm-a510.c b/arch/arm/mach-dove/cm-a510.c index 03e11f9dca97..c8a406f7e946 100644 --- a/arch/arm/mach-dove/cm-a510.c +++ b/arch/arm/mach-dove/cm-a510.c @@ -87,7 +87,7 @@ static void __init cm_a510_init(void) } MACHINE_START(CM_A510, "Compulab CM-A510 Board") - .boot_params = 0x00000100, + .atag_offset = 0x100, .init_machine = cm_a510_init, .map_io = dove_map_io, .init_early = dove_init_early, diff --git a/arch/arm/mach-dove/dove-db-setup.c b/arch/arm/mach-dove/dove-db-setup.c index 2ac34ecfa745..11ea34e4fc76 100644 --- a/arch/arm/mach-dove/dove-db-setup.c +++ b/arch/arm/mach-dove/dove-db-setup.c @@ -94,7 +94,7 @@ static void __init dove_db_init(void) } MACHINE_START(DOVE_DB, "Marvell DB-MV88AP510-BP Development Board") - .boot_params = 0x00000100, + .atag_offset = 0x100, .init_machine = dove_db_init, .map_io = dove_map_io, .init_early = dove_init_early, diff --git a/arch/arm/mach-ebsa110/core.c b/arch/arm/mach-ebsa110/core.c index 087bc771ac23..d0ce8abdd4b6 100644 --- a/arch/arm/mach-ebsa110/core.c +++ b/arch/arm/mach-ebsa110/core.c @@ -280,7 +280,7 @@ arch_initcall(ebsa110_init); MACHINE_START(EBSA110, "EBSA110") /* Maintainer: Russell King */ - .boot_params = 0x00000400, + .atag_offset = 0x400, .reserve_lp0 = 1, .reserve_lp2 = 1, .soft_reboot = 1, diff --git a/arch/arm/mach-ep93xx/adssphere.c b/arch/arm/mach-ep93xx/adssphere.c index 61b98ce4b673..0713448206a5 100644 --- a/arch/arm/mach-ep93xx/adssphere.c +++ b/arch/arm/mach-ep93xx/adssphere.c @@ -33,7 +33,7 @@ static void __init adssphere_init_machine(void) MACHINE_START(ADSSPHERE, "ADS Sphere board") /* Maintainer: Lennert Buytenhek */ - .boot_params = EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, diff --git a/arch/arm/mach-ep93xx/edb93xx.c b/arch/arm/mach-ep93xx/edb93xx.c index 9969bb115f60..257175edc575 100644 --- a/arch/arm/mach-ep93xx/edb93xx.c +++ b/arch/arm/mach-ep93xx/edb93xx.c @@ -240,7 +240,7 @@ static void __init edb93xx_init_machine(void) #ifdef CONFIG_MACH_EDB9301 MACHINE_START(EDB9301, "Cirrus Logic EDB9301 Evaluation Board") /* Maintainer: H Hartley Sweeten */ - .boot_params = 
EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, @@ -251,7 +251,7 @@ MACHINE_END #ifdef CONFIG_MACH_EDB9302 MACHINE_START(EDB9302, "Cirrus Logic EDB9302 Evaluation Board") /* Maintainer: George Kashperko */ - .boot_params = EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, @@ -262,7 +262,7 @@ MACHINE_END #ifdef CONFIG_MACH_EDB9302A MACHINE_START(EDB9302A, "Cirrus Logic EDB9302A Evaluation Board") /* Maintainer: Lennert Buytenhek */ - .boot_params = EP93XX_SDCE0_PHYS_BASE + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, @@ -273,7 +273,7 @@ MACHINE_END #ifdef CONFIG_MACH_EDB9307 MACHINE_START(EDB9307, "Cirrus Logic EDB9307 Evaluation Board") /* Maintainer: Herbert Valerio Riedel */ - .boot_params = EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, @@ -284,7 +284,7 @@ MACHINE_END #ifdef CONFIG_MACH_EDB9307A MACHINE_START(EDB9307A, "Cirrus Logic EDB9307A Evaluation Board") /* Maintainer: H Hartley Sweeten */ - .boot_params = EP93XX_SDCE0_PHYS_BASE + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, @@ -295,7 +295,7 @@ MACHINE_END #ifdef CONFIG_MACH_EDB9312 MACHINE_START(EDB9312, "Cirrus Logic EDB9312 Evaluation Board") /* Maintainer: Toufeeq Hussain */ - .boot_params = EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, @@ -306,7 +306,7 @@ MACHINE_END #ifdef CONFIG_MACH_EDB9315 MACHINE_START(EDB9315, "Cirrus Logic EDB9315 Evaluation Board") /* Maintainer: Lennert Buytenhek */ - .boot_params = EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, @@ -317,7 +317,7 @@ MACHINE_END #ifdef CONFIG_MACH_EDB9315A MACHINE_START(EDB9315A, "Cirrus Logic EDB9315A Evaluation Board") /* Maintainer: Lennert Buytenhek */ - .boot_params = EP93XX_SDCE0_PHYS_BASE + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, diff --git a/arch/arm/mach-ep93xx/gesbc9312.c b/arch/arm/mach-ep93xx/gesbc9312.c index 9bd3152bff9a..45ee205856f8 100644 --- a/arch/arm/mach-ep93xx/gesbc9312.c +++ b/arch/arm/mach-ep93xx/gesbc9312.c @@ -33,7 +33,7 @@ static void __init gesbc9312_init_machine(void) MACHINE_START(GESBC9312, "Glomation GESBC-9312-sx") /* Maintainer: Lennert Buytenhek */ - .boot_params = EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, diff --git a/arch/arm/mach-ep93xx/micro9.c b/arch/arm/mach-ep93xx/micro9.c index 7adea6258efe..e72f7368876e 100644 --- a/arch/arm/mach-ep93xx/micro9.c +++ b/arch/arm/mach-ep93xx/micro9.c @@ -77,7 +77,7 @@ static void __init micro9_init_machine(void) #ifdef CONFIG_MACH_MICRO9H MACHINE_START(MICRO9, "Contec Micro9-High") /* Maintainer: Hubert Feurstein */ - .boot_params = EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100, + .atag_offset = 0x100, .map_io = ep93xx_map_io, .init_irq = ep93xx_init_irq, .timer = &ep93xx_timer, @@ -88,7 +88,7 @@ MACHINE_END #ifdef CONFIG_MACH_MICRO9M MACHINE_START(MICRO9M, "Contec Micro9-Mid") /* Maintainer: Hubert Feurstein */ - .boot_params = EP93XX_SDCE3_PHYS_BASE_ASYNC + 0x100, + 
diff --git a/arch/arm/mach-ep93xx/micro9.c b/arch/arm/mach-ep93xx/micro9.c
index 7adea6258efe..e72f7368876e 100644
--- a/arch/arm/mach-ep93xx/micro9.c
+++ b/arch/arm/mach-ep93xx/micro9.c
@@ -77,7 +77,7 @@ static void __init micro9_init_machine(void)
 #ifdef CONFIG_MACH_MICRO9H
 MACHINE_START(MICRO9, "Contec Micro9-High")
 	/* Maintainer: Hubert Feurstein */
-	.boot_params	= EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100,
+	.atag_offset	= 0x100,
 	.map_io		= ep93xx_map_io,
 	.init_irq	= ep93xx_init_irq,
 	.timer		= &ep93xx_timer,
@@ -88,7 +88,7 @@ MACHINE_END
 #ifdef CONFIG_MACH_MICRO9M
 MACHINE_START(MICRO9M, "Contec Micro9-Mid")
 	/* Maintainer: Hubert Feurstein */
-	.boot_params	= EP93XX_SDCE3_PHYS_BASE_ASYNC + 0x100,
+	.atag_offset	= 0x100,
 	.map_io		= ep93xx_map_io,
 	.init_irq	= ep93xx_init_irq,
 	.timer		= &ep93xx_timer,
@@ -99,7 +99,7 @@ MACHINE_END
 #ifdef CONFIG_MACH_MICRO9L
 MACHINE_START(MICRO9L, "Contec Micro9-Lite")
 	/* Maintainer: Hubert Feurstein */
-	.boot_params	= EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100,
+	.atag_offset	= 0x100,
 	.map_io		= ep93xx_map_io,
 	.init_irq	= ep93xx_init_irq,
 	.timer		= &ep93xx_timer,
@@ -110,7 +110,7 @@ MACHINE_END
 #ifdef CONFIG_MACH_MICRO9S
 MACHINE_START(MICRO9S, "Contec Micro9-Slim")
 	/* Maintainer: Hubert Feurstein */
-	.boot_params	= EP93XX_SDCE3_PHYS_BASE_ASYNC + 0x100,
+	.atag_offset	= 0x100,
 	.map_io		= ep93xx_map_io,
 	.init_irq	= ep93xx_init_irq,
 	.timer		= &ep93xx_timer,
diff --git a/arch/arm/mach-ep93xx/simone.c b/arch/arm/mach-ep93xx/simone.c
index 8392e95d7cea..238bc603da86 100644
--- a/arch/arm/mach-ep93xx/simone.c
+++ b/arch/arm/mach-ep93xx/simone.c
@@ -65,8 +65,8 @@ static void __init simone_init_machine(void)
 }
 
 MACHINE_START(SIM_ONE, "Simplemachines Sim.One Board")
-/* Maintainer: Ryan Mallon */
-	.boot_params	= EP93XX_SDCE0_PHYS_BASE + 0x100,
+	/* Maintainer: Ryan Mallon */
+	.atag_offset	= 0x100,
 	.map_io		= ep93xx_map_io,
 	.init_irq	= ep93xx_init_irq,
 	.timer		= &ep93xx_timer,
diff --git a/arch/arm/mach-ep93xx/snappercl15.c b/arch/arm/mach-ep93xx/snappercl15.c
index 2e9c614757e4..3bdf3a2e5ad0 100644
--- a/arch/arm/mach-ep93xx/snappercl15.c
+++ b/arch/arm/mach-ep93xx/snappercl15.c
@@ -163,7 +163,7 @@ static void __init snappercl15_init_machine(void)
 
 MACHINE_START(SNAPPER_CL15, "Bluewater Systems Snapper CL15")
 	/* Maintainer: Ryan Mallon */
-	.boot_params	= EP93XX_SDCE0_PHYS_BASE + 0x100,
+	.atag_offset	= 0x100,
 	.map_io		= ep93xx_map_io,
 	.init_irq	= ep93xx_init_irq,
 	.timer		= &ep93xx_timer,
diff --git a/arch/arm/mach-ep93xx/ts72xx.c b/arch/arm/mach-ep93xx/ts72xx.c
index c2d2cf40ead9..1ade3c340507 100644
--- a/arch/arm/mach-ep93xx/ts72xx.c
+++ b/arch/arm/mach-ep93xx/ts72xx.c
@@ -257,7 +257,7 @@ static void __init ts72xx_init_machine(void)
 
 MACHINE_START(TS72XX, "Technologic Systems TS-72xx SBC")
 	/* Maintainer: Lennert Buytenhek */
-	.boot_params	= EP93XX_SDCE3_PHYS_BASE_SYNC + 0x100,
+	.atag_offset	= 0x100,
 	.map_io		= ts72xx_map_io,
 	.init_irq	= ep93xx_init_irq,
 	.timer		= &ep93xx_timer,
diff --git a/arch/arm/mach-exynos4/clock.c b/arch/arm/mach-exynos4/clock.c
index 851dea018578..1561b036a9bf 100644
--- a/arch/arm/mach-exynos4/clock.c
+++ b/arch/arm/mach-exynos4/clock.c
@@ -520,7 +520,7 @@ static struct clk init_clocks_off[] = {
 		.ctrlbit	= (1 << 21),
 	}, {
 		.name		= "ac97",
-		.id		= -1,
+		.devname	= "samsung-ac97",
 		.enable		= exynos4_clk_ip_peril_ctrl,
 		.ctrlbit	= (1 << 27),
 	}, {
diff --git a/arch/arm/mach-exynos4/cpu.c b/arch/arm/mach-exynos4/cpu.c
index 2d8a40c9e6e5..746d6fc6d397 100644
--- a/arch/arm/mach-exynos4/cpu.c
+++ b/arch/arm/mach-exynos4/cpu.c
@@ -24,12 +24,13 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
+#include
 #include
+#include
 
 extern int combiner_init(unsigned int combiner_nr, void __iomem *base,
 			 unsigned int irq_start);
@@ -128,6 +129,11 @@ static void exynos4_idle(void)
 	local_irq_enable();
 }
 
+static void exynos4_sw_reset(void)
+{
+	__raw_writel(0x1, S5P_SWRESET);
+}
+
 /*
  * exynos4_map_io
  *
@@ -241,5 +247,8 @@ int __init exynos4_init(void)
 	/* set idle function */
 	pm_idle = exynos4_idle;
 
+	/* set sw_reset function */
+	s5p_reset_hook = exynos4_sw_reset;
+
 	return sysdev_register(&exynos4_sysdev);
 }
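exynos4_sw_reset() above writes 0x1 to the S5P_SWRESET PMU register (added to regs-pmu.h below at offset 0x0400) and is installed through s5p_reset_hook, so a reboot request reaches the SoC-level software reset rather than depending on a watchdog. A hedged sketch of how such a hook is typically consumed; this paraphrases the plat-s5p reset path rather than quoting it, and the fallback name arch_wdt_reset() is an assumption:

/*
 * Illustrative only: a platform-wide hook that an individual SoC
 * may override with its own software-reset implementation.
 */
void (*s5p_reset_hook)(void);

static void s5p_arch_reset(char mode, const char *cmd)
{
	if (s5p_reset_hook)
		s5p_reset_hook();	/* e.g. exynos4_sw_reset() */
	else
		arch_wdt_reset();	/* assumed watchdog-based fallback */
}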
diff --git a/arch/arm/mach-exynos4/include/mach/irqs.h b/arch/arm/mach-exynos4/include/mach/irqs.h
index 934d2a493982..f8952f8f3757 100644
--- a/arch/arm/mach-exynos4/include/mach/irqs.h
+++ b/arch/arm/mach-exynos4/include/mach/irqs.h
@@ -80,9 +80,8 @@
 #define IRQ_HSMMC3		IRQ_SPI(76)
 #define IRQ_DWMCI		IRQ_SPI(77)
 
-#define IRQ_MIPICSI0		IRQ_SPI(78)
-
-#define IRQ_MIPICSI1		IRQ_SPI(80)
+#define IRQ_MIPI_CSIS0		IRQ_SPI(78)
+#define IRQ_MIPI_CSIS1		IRQ_SPI(80)
 
 #define IRQ_ONENAND_AUDI	IRQ_SPI(82)
 #define IRQ_ROTATOR		IRQ_SPI(83)
diff --git a/arch/arm/mach-exynos4/include/mach/regs-pmu.h b/arch/arm/mach-exynos4/include/mach/regs-pmu.h
index fa49bbb8e7b0..cdf9b47c303c 100644
--- a/arch/arm/mach-exynos4/include/mach/regs-pmu.h
+++ b/arch/arm/mach-exynos4/include/mach/regs-pmu.h
@@ -29,6 +29,8 @@
 #define S5P_USE_STANDBY_WFE1		(1 << 25)
 #define S5P_USE_MASK			((0x3 << 16) | (0x3 << 24))
 
+#define S5P_SWRESET			S5P_PMUREG(0x0400)
+
 #define S5P_WAKEUP_STAT			S5P_PMUREG(0x0600)
 #define S5P_EINT_WAKEUP_MASK		S5P_PMUREG(0x0604)
 #define S5P_WAKEUP_MASK			S5P_PMUREG(0x0608)
diff --git a/arch/arm/mach-exynos4/irq-eint.c b/arch/arm/mach-exynos4/irq-eint.c
index 9d87d2ac7f68..badb8c66fc9b 100644
--- a/arch/arm/mach-exynos4/irq-eint.c
+++ b/arch/arm/mach-exynos4/irq-eint.c
@@ -23,6 +23,8 @@
 #include
 
+#include
+
 static DEFINE_SPINLOCK(eint_lock);
 
 static unsigned int eint0_15_data[16];
@@ -184,8 +186,11 @@ static inline void exynos4_irq_demux_eint(unsigned int start)
 static void exynos4_irq_demux_eint16_31(unsigned int irq, struct irq_desc *desc)
 {
+	struct irq_chip *chip = irq_get_chip(irq);
+
+	chained_irq_enter(chip, desc);
 	exynos4_irq_demux_eint(IRQ_EINT(16));
 	exynos4_irq_demux_eint(IRQ_EINT(24));
+	chained_irq_exit(chip, desc);
 }
 
 static void exynos4_irq_eint0_15(unsigned int irq, struct irq_desc *desc)
@@ -193,6 +198,7 @@ static void exynos4_irq_eint0_15(unsigned int irq, struct irq_desc *desc)
 	u32 *irq_data = irq_get_handler_data(irq);
 	struct irq_chip *chip = irq_get_chip(irq);
 
+	chained_irq_enter(chip, desc);
 	chip->irq_mask(&desc->irq_data);
 
 	if (chip->irq_ack)
@@ -201,6 +207,7 @@ static void exynos4_irq_eint0_15(unsigned int irq, struct irq_desc *desc)
 	generic_handle_irq(*irq_data);
 
 	chip->irq_unmask(&desc->irq_data);
+	chained_irq_exit(chip, desc);
 }
 
 int __init exynos4_init_irq_eint(void)
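The chained_irq_enter()/chained_irq_exit() pairs added above bracket each demux handler so the parent interrupt controller is masked or acked on entry and EOI'd or unmasked on exit, according to its flow type; on a fasteoi parent such as the GIC, omitting the exit call would leave the interrupt without its EOI and it would fire only once. A condensed sketch of the pattern, with my_demux_handler and my_decode_pending() as invented placeholders:

#include <linux/irq.h>
#include <asm/mach/irq.h>	/* chained_irq_enter()/chained_irq_exit() */

static unsigned int my_decode_pending(void);	/* hypothetical board-specific decode */

static void my_demux_handler(unsigned int irq, struct irq_desc *desc)
{
	struct irq_chip *chip = irq_get_chip(irq);

	chained_irq_enter(chip, desc);		/* mask/ack the parent as its flow requires */
	generic_handle_irq(my_decode_pending());
	chained_irq_exit(chip, desc);		/* EOI or unmask so the parent can fire again */
}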
diff --git a/arch/arm/mach-exynos4/mach-armlex4210.c b/arch/arm/mach-exynos4/mach-armlex4210.c
index b482c6285fc4..f0ca6c157d29 100644
--- a/arch/arm/mach-exynos4/mach-armlex4210.c
+++ b/arch/arm/mach-exynos4/mach-armlex4210.c
@@ -207,7 +207,7 @@ static void __init armlex4210_machine_init(void)
 
 MACHINE_START(ARMLEX4210, "ARMLEX4210")
 	/* Maintainer: Alim Akhtar */
-	.boot_params	= S5P_PA_SDRAM + 0x100,
+	.atag_offset	= 0x100,
 	.init_irq	= exynos4_init_irq,
 	.map_io		= armlex4210_map_io,
 	.init_machine	= armlex4210_machine_init,
diff --git a/arch/arm/mach-exynos4/mach-nuri.c b/arch/arm/mach-exynos4/mach-nuri.c
index 43be71b799cb..6e0536818bf5 100644
--- a/arch/arm/mach-exynos4/mach-nuri.c
+++ b/arch/arm/mach-exynos4/mach-nuri.c
@@ -1152,7 +1152,7 @@ static void __init nuri_machine_init(void)
 
 MACHINE_START(NURI, "NURI")
 	/* Maintainer: Kyungmin Park */
-	.boot_params	= S5P_PA_SDRAM + 0x100,
+	.atag_offset	= 0x100,
 	.init_irq	= exynos4_init_irq,
 	.map_io		= nuri_map_io,
 	.init_machine	= nuri_machine_init,
diff --git a/arch/arm/mach-exynos4/mach-smdkc210.c b/arch/arm/mach-exynos4/mach-smdkc210.c
index a7c65e05c1eb..b24ddd7ad8fe 100644
--- a/arch/arm/mach-exynos4/mach-smdkc210.c
+++ b/arch/arm/mach-exynos4/mach-smdkc210.c
@@ -301,7 +301,7 @@ static void __init smdkc210_machine_init(void)
 
 MACHINE_START(SMDKC210, "SMDKC210")
 	/* Maintainer: Kukjin Kim */
-	.boot_params	= S5P_PA_SDRAM + 0x100,
+	.atag_offset	= 0x100,
 	.init_irq	= exynos4_init_irq,
 	.map_io		= smdkc210_map_io,
 	.init_machine	= smdkc210_machine_init,
diff --git a/arch/arm/mach-exynos4/mach-smdkv310.c b/arch/arm/mach-exynos4/mach-smdkv310.c
index ea4149556860..d90fcddbee1f 100644
--- a/arch/arm/mach-exynos4/mach-smdkv310.c
+++ b/arch/arm/mach-exynos4/mach-smdkv310.c
@@ -255,7 +255,7 @@ static void __init smdkv310_machine_init(void)
 MACHINE_START(SMDKV310, "SMDKV310")
 	/* Maintainer: Kukjin Kim */
 	/* Maintainer: Changhwan Youn */
-	.boot_params	= S5P_PA_SDRAM + 0x100,
+	.atag_offset	= 0x100,
 	.init_irq	= exynos4_init_irq,
 	.map_io		= smdkv310_map_io,
 	.init_machine	= smdkv310_machine_init,
diff --git a/arch/arm/mach-exynos4/mach-universal_c210.c b/arch/arm/mach-exynos4/mach-universal_c210.c
index 0e280d12301e..2aac6f755c8e 100644
--- a/arch/arm/mach-exynos4/mach-universal_c210.c
+++ b/arch/arm/mach-exynos4/mach-universal_c210.c
@@ -79,7 +79,7 @@ static struct s3c2410_uartcfg universal_uartcfgs[] __initdata = {
 };
 
 static struct regulator_consumer_supply max8952_consumer =
-	REGULATOR_SUPPLY("vddarm", NULL);
+	REGULATOR_SUPPLY("vdd_arm", NULL);
 
 static struct max8952_platform_data universal_max8952_pdata __initdata = {
 	.gpio_vid0	= EXYNOS4_GPX0(3),
@@ -105,7 +105,7 @@
 };
 
 static struct regulator_consumer_supply lp3974_buck1_consumer =
-	REGULATOR_SUPPLY("vddint", NULL);
+	REGULATOR_SUPPLY("vdd_int", NULL);
 
 static struct regulator_consumer_supply lp3974_buck2_consumer =
 	REGULATOR_SUPPLY("vddg3d", NULL);
@@ -762,7 +762,7 @@ static void __init universal_machine_init(void)
 
 MACHINE_START(UNIVERSAL_C210, "UNIVERSAL_C210")
 	/* Maintainer: Kyungmin Park */
-	.boot_params	= S5P_PA_SDRAM + 0x100,
+	.atag_offset	= 0x100,
 	.init_irq	= exynos4_init_irq,
 	.map_io		= universal_map_io,
 	.init_machine	= universal_machine_init,
diff --git a/arch/arm/mach-exynos4/setup-usb-phy.c b/arch/arm/mach-exynos4/setup-usb-phy.c
index 0883c1b824b9..39aca045f660 100644
--- a/arch/arm/mach-exynos4/setup-usb-phy.c
+++ b/arch/arm/mach-exynos4/setup-usb-phy.c
@@ -82,7 +82,7 @@ static int exynos4_usb_phy1_init(struct platform_device *pdev)
 	rstcon &= ~(HOST_LINK_PORT_SWRST_MASK | PHY1_SWRST_MASK);
 	writel(rstcon, EXYNOS4_RSTCON);
 
-	udelay(50);
+	udelay(80);
 
 	clk_disable(otg_clk);
 	clk_put(otg_clk);
diff --git a/arch/arm/mach-footbridge/cats-hw.c b/arch/arm/mach-footbridge/cats-hw.c
index 5b1a8db779be..a3da5d1106c2 100644
--- a/arch/arm/mach-footbridge/cats-hw.c
+++ b/arch/arm/mach-footbridge/cats-hw.c
@@ -86,7 +86,7 @@ fixup_cats(struct machine_desc *desc, struct tag *tags,
 
 MACHINE_START(CATS, "Chalice-CATS")
 	/* Maintainer: Philip Blundell */
-	.boot_params	= 0x00000100,
+	.atag_offset	= 0x100,
 	.soft_reboot	= 1,
 	.fixup		= fixup_cats,
 	.map_io		= footbridge_map_io,
diff --git a/arch/arm/mach-footbridge/dc21285.c b/arch/arm/mach-footbridge/dc21285.c
index 1331fff51ae2..18c32a5541d9 100644
--- a/arch/arm/mach-footbridge/dc21285.c
+++ b/arch/arm/mach-footbridge/dc21285.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include