spi: fix the crash when callbacks are not in the IRAM

Introduced in 9c23b8e5 and 4f87a62f. To get higher speed, menuconfig
options are added to put ISR and other functions into the IRAM.  The
interrupt flag ESP_INTR_FLAG_IRAM is also mistakenly set when the ISR is
put into the IRAM. However callbacks, which are wrote by the user, are
called in the master and slave ISR. The user may not be aware of that
these callbacks are not disabled during flash operations. Any cache miss
during flash operation will cause panic.

Essentially IRAM functions and intrrupt flag ESP_INTR_FLAG_IRAM are
different, the latter means not disabling the ISR during flash
operations.  New bus_config flag intr_flags is offered to help set the
interrupt attribute, including priority level, SHARED, IRAM (not
disabled during flash operations).  It introduced a small BREAK to
IDFv3.1 (but the same as IDFv3.0) that the user has to manually set IRAM
flag now (therefore he's aware of the IRAM thing) to void the ISR being
disabled during flash operations.
This commit is contained in:
Michael (XIAO Xufeng)
2018-10-23 16:57:32 +08:00
committed by michael
parent 13155223f3
commit 3387d751d9
7 changed files with 87 additions and 24 deletions

View File

@@ -180,7 +180,8 @@ Speed and Timing Considerations
Transferring speed
^^^^^^^^^^^^^^^^^^
There're two factors limiting the transferring speed: (1) The transaction interval, (2) The SPI clock frequency used.
There're three factors limiting the transferring speed: (1) The transaction interval, (2) The SPI clock frequency used.
(3) The cache miss of SPI functions including callbacks.
When large transactions are used, the clock frequency determines the transferring speed; while the interval effects the
speed a lot if small transactions are used.
@@ -211,6 +212,13 @@ speed a lot if small transactions are used.
2. SPI clock frequency: Each byte transferred takes 8 times of the clock period *8/fspi*. If the clock frequency is
too high, some functions may be limited to use. See :ref:`timing_considerations`.
3. The cache miss: the default config puts only the ISR into the IRAM.
Other SPI related functions including the driver itself and the callback
may suffer from the cache miss and wait for some time while reading code
from the flash. Select :envvar:`CONFIG_SPI_MASTER_IN_IRAM` to put the whole
SPI driver into IRAM, and put the entire callback(s) and its callee
functions into IRAM to prevent this.
For a normal transaction, the overall cost is *20+8n/Fspi[MHz]* [us] for n bytes tranferred
in one transaction. Hence the transferring speed is : *n/(20+8n/Fspi)*. Example of transferring speed under 8MHz
clock speed:
@@ -234,6 +242,15 @@ clock speed:
When the length of transaction is short, the cost of transaction interval is really high. Please try to squash data
into one transaction if possible to get higher transfer speed.
BTW, the ISR is disabled during flash operation by default. To keep sending
transactions during flash operations, enable
:envvar:`CONFIG_SPI_MASTER_ISR_IN_IRAM` and set :cpp:class:`ESP_INTR_FLAG_IRAM`
in the ``intr_flags`` member of :cpp:class:`spi_bus_config_t`. Then all the
transactions queued before the flash operations will be handled by the ISR
continuously during flash operation. Note that the callback of each devices,
and their callee functions, should be in the IRAM in this case, or your
callback will crash due to cache miss.
.. _timing_considerations:
Timing considerations
@@ -286,10 +303,10 @@ And if the host only writes, the *dummy bit workaround* is not used and the freq
| 40 | 80 |
+-------------------+------------------+
The spi master driver can work even if the *input delay* in the ``spi_device_interface_config_t`` is set to 0.
However, setting a accurate value helps to: (1) calculate the frequency limit in full duplex mode, and (2) compensate
the timing correctly by dummy bits in half duplex mode. You may find the maximum data valid time after the launch edge
of SPI clocks in the AC characteristics chapter of the device specifications, or measure the time on a oscilloscope or
The spi master driver can work even if the *input delay* in the ``spi_device_interface_config_t`` is set to 0.
However, setting a accurate value helps to: (1) calculate the frequency limit in full duplex mode, and (2) compensate
the timing correctly by dummy bits in half duplex mode. You may find the maximum data valid time after the launch edge
of SPI clocks in the AC characteristics chapter of the device specifications, or measure the time on a oscilloscope or
logic analyzer.
.. wavedrom don't support rendering pdflatex till now(1.3.1), so we use the png here
@@ -327,18 +344,18 @@ Some typical delays are shown in the following table:
| chip, 12.5ns sample delay included. |
+---------------------------------------+
The MISO path delay(tv), consists of slave *input delay* and master *GPIO matrix delay*, finally determines the
frequency limit, above which the full duplex mode will not work, or dummy bits are used in the half duplex mode. The
The MISO path delay(tv), consists of slave *input delay* and master *GPIO matrix delay*, finally determines the
frequency limit, above which the full duplex mode will not work, or dummy bits are used in the half duplex mode. The
frequency limit is:
*Freq limit[MHz] = 80 / (floor(MISO delay[ns]/12.5) + 1)*
The figure below shows the relations of frequency limit against the input delay. 2 extra apb clocks should be counted
The figure below shows the relations of frequency limit against the input delay. 2 extra apb clocks should be counted
into the MISO delay if the GPIO matrix in the master is used.
.. image:: /../_static/spi_master_freq_tv.png
Corresponding frequency limit for different devices with different *input delay* are shown in the following
Corresponding frequency limit for different devices with different *input delay* are shown in the following
table:
+--------+------------------+----------------------+-------------------+