Logging v1 has been removed and the log_strdup wrapper function is no
longer needed. Remove the function and its uses in the tree.
Signed-off-by: Krzysztof Chruscinski <krzysztof.chruscinski@nordicsemi.no>
So far, TCP cloned a packet with data on the RX path for the application,
leaving the original packet intact. This isn't really needed, as the
original packet is unconditionally freed later anyway, so TCP might as
well simply queue the original packet for the application, while
informing the network processing core that the packet was consumed by
the TCP layer.
This improves the download throughput even further, since the CPU no
longer wastes time on needless packet copying.
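A before/after sketch of the idea (function and verdict names are only
illustrative of the net core API, not the exact in-tree diff):

    /* Before: keep the original packet and queue a copy for the app.
     *   incoming = net_pkt_clone(pkt, CLONE_TIMEOUT);
     *   tcp_queue_recv_data(conn, incoming);
     *   return NET_DROP;   original later freed by the net core
     */

    /* After: hand the original packet to the application and report it
     * as consumed, so the net core does not free it and no copy is made.
     */
    tcp_queue_recv_data(conn, pkt);
    return NET_OK;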
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
In case the TCP stack enters the TIME_WAIT state (after receiving a
FIN/ACK reply from the peer), it should still be ready to reply with an
ACK to any subsequent FIN attempts. Otherwise, in case the final ACK
from the Zephyr side is lost, the connection is not properly closed on
the other end, and the peer keeps retransmitting the final FIN packet.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Implement a mechanism, according to RFC 813, which prevents the
so-called "Silly Window Syndrome" - a scenario where the TCP receiver
keeps reporting small window sizes in its acknowledgments, effectively
limiting the connection throughput. This improves performance in
low-buffer configurations, where the maximum window size is small and
the issue occurred quite often.
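A minimal sketch of the receiver-side rule from RFC 813 (pure
illustration, not the exact in-tree code): only advertise a larger
window once the freed-up space amounts to at least one MSS or half of
the receive buffer.

    #include <stdint.h>

    static uint32_t sws_advertised_window(uint32_t cur_win,
                                          uint32_t free_space,
                                          uint32_t mss, uint32_t buf_size)
    {
        uint32_t threshold = (mss < buf_size / 2) ? mss : buf_size / 2;

        if (free_space < cur_win + threshold) {
            /* Increment too small - keep reporting the previously
             * advertised (possibly zero) window instead of a new
             * "silly" small one.
             */
            return cur_win;
        }

        return free_space;
    }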
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
As the stack can safely keep allocating data during retransmission
mode, there is no need to take the tx_sem during retransmission any
more. Data stored in the send_data buffer will be transmitted once the
data for which an ACK is pending has been acknowledged. This avoids the
application being fully stalled when the TCP connection enters
retransmission mode.
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
Now that the window_full function has been fixed to look at
send_data_total instead of unacked_len, there is no risk in sending
data while in retransmission mode.
This reverts commit 0088aaefa0.
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
To improve performance when sending small chunks of data, implement
Nagle's algorithm. Provide the TCP_NODELAY option to disable the
algorithm.
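A usage sketch for the new option (assuming a connected socket
descriptor `sock` and the Zephyr BSD socket API; the socket header path
differs between Zephyr versions):

    int one = 1;

    /* Disable Nagle's algorithm so small writes go out immediately
     * instead of being coalesced into larger segments.
     */
    if (setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) < 0) {
        /* option not supported or invalid socket */
    }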
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
In the function net_tcp_queue_data(), when -ENOBUFS is returned by
tcp_send_queued_data(), the whole block of size len is thrown away from
send_data. If len > MSS, it could happen that the first segment is
transmitted, but the second one hits -ENOBUFS. In that case the data
has been transmitted, but is later removed from send_data.
To circumvent this problem, check if len + unacked_len is smaller than
send_data_total. If so, the data can safely be removed from send_data.
Otherwise, just pretend the transmission went OK; the acknowledgment
and retransmit path will eventually take care of it.
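An illustrative sketch of the new check (field and helper names follow
the commit text and may not match the final diff exactly):

    ret = tcp_send_queued_data(conn);
    if (ret == -ENOBUFS) {
        if (len + conn->unacked_len < conn->send_data_total) {
            /* Nothing of this block went out yet - safe to drop it
             * from send_data and report the failure to the caller.
             */
            net_pkt_remove_tail(conn->send_data, len);
        } else {
            /* Part of the block may already be on the wire: keep it
             * queued and let the ACK/retransmit path handle it.
             */
            ret = 0;
        }
    }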
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
The window-full computation was corrected to use send_data_total
instead of unacked_len. This conflicted with the new polling
implementation due to the moment at which these values are changed.
Move taking the tx_sem outside of tcp_send_queued_data() to handle the
-ENOBUFS situation properly in case it is called from
net_tcp_queue_data(). net_tcp_queue_data() removes data from send_data
in case the transmission failed with -ENOBUFS, which causes the buffer
to no longer be full.
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
Explicitly log an error when a buffer allocation fails during TCP
retransmission. This avoids retransmissions silently failing due to
repeated buffer allocation failures.
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
When there was no room to transmit the next packet, the -ENOBUFS error
could cause the retransmission to fail.
Secondly, conn->unacked_len can be set to 0 in the retransmission
process, causing the subscription to the transmit timer to fail. Use
the variable send_data_total instead.
Make sure that when the send_data buffer becomes empty the send_timer
is cancelled, but any pending data still keeps being transmitted.
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
The function tcp_data_get() tries to update the TCP receive window
using net_context_update_recv_wnd(). That function grabs the context
lock, while tcp_data_get() is called in a situation where the TCP lock
is already held. Transmission actions first grab the context lock and
then try to grab the TCP lock. The combination of both can cause a
deadlock.
By taking the shortcut of directly updating the TCP receive window
without going through the net context, the context lock is not
required, avoiding a possible deadlock situation.
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
In the existing code, the value received from the other side via the
TCP MSS option is used as the MSS for transmission. Since the MSS
option is an announcement rather than a negotiation, the receiver is
likely to have a different, and possibly bigger, MSS than allowed by
our side. This potentially allows for a different MSS in the receive
and transmit paths.
Directly using the received MSS could cause problems when our MSS is
only allowed to be small. For that reason, at transmission take the
minimum of the received MSS and our desired MSS to find a value
compatible with both sides of the link.
Rename the function from net_tcp_get_recv_mss to
net_tcp_get_supported_mss to better reflect its purpose in the new
situation.
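The essence of the change, as a sketch (MIN() is the Zephyr util macro;
field and helper names are illustrative):

    static uint16_t net_tcp_get_supported_mss(const struct tcp *conn)
    {
        /* The peer's MSS option is an announcement, not a negotiation,
         * so clamp it to what our side supports to get a value both
         * ends of the link can handle.
         */
        uint16_t our_mss = tcp_local_mss(conn);   /* hypothetical helper */

        return MIN(conn->recv_options.mss, our_mss);
    }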
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
In the stack, both unacked_len and send_data_total track the amount
of data for retransmission. send_data_total actually accounts for the
total bytes in the buffer, whereas unacked_len is used to control the
retransmission progress.
Since unacked_len is sometimes reset to 0, using it can lead to more
data being allowed into the send_data buffer. In the worst case this
can deplete the net buffers, causing a stall and crash of the
connection.
As send_data_total actually accounts for the total amount of data in
the send_data buffer, it is the proper value to use in the
tcp_window_full function.
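Sketch of the corrected check (field names as used in the commit text):

    static bool tcp_window_full(const struct tcp *conn)
    {
        /* send_data_total counts everything queued in send_data;
         * unacked_len may be reset to 0 during retransmission and
         * would under-report how much is actually outstanding.
         */
        return conn->send_data_total >= conn->send_win;
    }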
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
In the function tcp_send_data(), the variable conn->unacked_len is
copied into a local variable pos. This value is only used in one
location and is mixed with the original conn->unacked_len.
This fix removes pos and switches to using conn->unacked_len everywhere
to reduce the chance of confusion. It does not functionally change the
code.
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
Instead of sending a ZWP from the send context when it is detected
that the window is full due to a zero window, implement a proper
persistent timer that is scheduled once a zero window is detected. The
timer is responsible for sending ZWPs to the peer and is canceled once
a non-zero window is notified by the peer.
Additionally, in case the peer reported a zero window, do not trigger
retransmission from net_tcp_queue_data(), as it won't be transmitted
by the stack anyway.
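A rough sketch of the timer handling with the delayable work API
(field names are illustrative; TCP_RTO_MS stands in for whatever probe
interval the stack actually uses):

    /* Peer just advertised a zero window: start probing. */
    if (new_win == 0) {
        (void)k_work_reschedule(&conn->persist_timer, K_MSEC(TCP_RTO_MS));
    } else {
        /* Non-zero window reported again: stop the probes. */
        (void)k_work_cancel_delayable(&conn->persist_timer);
    }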
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
The semaphore is reset when the TCP layer would normally reject a
transfer request (either due to the TX window being full or entering
retransmission mode). Once data is acknowledged, or the retransmission
is done, the semaphore is given again.
Upper layers can monitor the semaphore with `k_poll()` instead of
waiting blindly before attempting to transmit again.
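This lets an upper layer wait for TX readiness, roughly like the
sketch below (assuming access to the connection's `tx_sem`):

    struct k_poll_event event;

    k_poll_event_init(&event, K_POLL_TYPE_SEM_AVAILABLE,
                      K_POLL_MODE_NOTIFY_ONLY, &conn->tx_sem);

    /* Block until the TCP layer signals that queueing data may succeed
     * again (window no longer full / retransmission finished).
     */
    (void)k_poll(&event, 1, K_FOREVER);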
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Make use of the status argument in the recv_cb() callback function -
instead of blindly reporting ECONNRESET whenever the TCP context is
dereferenced, indicate whether an actual error condition happened (by
setting the respective errno value) or a graceful shutdown took place
(by setting status to 0).
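On the application side the distinction can then be handled roughly
like this (sketch of a net_context receive callback; see net_context.h
for the exact signature):

    static void recv_cb(struct net_context *context, struct net_pkt *pkt,
                        union net_ip_header *ip_hdr,
                        union net_proto_header *proto_hdr,
                        int status, void *user_data)
    {
        if (pkt == NULL && status == 0) {
            /* graceful shutdown - the peer closed the connection */
        } else if (status < 0) {
            /* a real error condition, e.g. -ECONNRESET */
        }
    }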
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
After introducing the SO_SNDBUF socket option, a possible deadlock
situation slipped into the TCP implementation. The scenario for the
deadlock:
* the application thread tries to send some data; it enters
  net_context_send(), which locks the context mutex,
* the internal context_sendto() blocks on a TX packet allocation; if
  the TX pool is empty, rescheduling takes place,
* now, if at the same time some incoming packet has arrived (an ACK
  for example), the TCP stack enters the tcp_in() function from a
  different thread. The function locks the TCP connection mutex and
  tries to obtain the SNDBUF option value. net_context_get_option()
  tries to lock the context mutex, but it is already held by the
  transmitting thread, so the receiving thread blocks,
* when a TX packet is available again, the transmitting thread
  unblocks and tries to pass the packet down to the TCP stack.
  net_tcp_queue_data() is called, which attempts to lock the TCP
  connection mutex, but it is already held by the receiving thread.
  Both threads are now in a deadlock with no chance to recover.
Fix this by obtaining the SNDBUF option value in tcp_in() before
locking the TCP connection mutex.
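Sketch of the reordering in tcp_in() (option and config names are
assumptions based on the commit text):

    int sndbuf_opt = 0;

    if (IS_ENABLED(CONFIG_NET_CONTEXT_SNDBUF) && conn->context != NULL) {
        /* May take the context lock - must happen while the TCP
         * connection mutex is NOT yet held.
         */
        (void)net_context_get_option(conn->context, NET_OPT_SNDBUF,
                                     &sndbuf_opt, NULL);
    }

    k_mutex_lock(&conn->lock, K_FOREVER);
    /* ... use the cached sndbuf_opt while holding the connection lock ... */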
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
In order to bring consistency in-tree, migrate all subsystem code to
the new prefix <zephyr/...>. Note that the conversion has been scripted,
refer to zephyrproject-rtos#45388 for more details.
Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
Instead of using a fixed FIN timeout, compute it based on the number
of retries. Fixes an issue found by PR 44545.
Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
Introduce set/get of the SO_SNDBUF option using the setsockopt
function. In addition, for TCP, check the sndbuf value before queuing
data.
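Usage sketch (standard socket option call, `sock` being an existing
socket descriptor):

    int sndbuf = 4096;   /* example value, in bytes */

    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf)) < 0) {
        /* handle error */
    }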
Signed-off-by: Mohan Kumar Kumar <mohankm@fb.com>
Introduce set/get of the SO_RCVBUF option using the setsockopt
function. In addition, use the rcvbuf value to set the TCP receive
window.
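Corresponding sketch for the receive side (setting the option before
the connection is established, so the value can feed the receive
window, is an assumption based on the commit text):

    int rcvbuf = 2048;   /* example value, in bytes */

    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) < 0) {
        /* handle error */
    }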
Signed-off-by: Mohan Kumar Kumar <mohankm@fb.com>
When connect() is called on a TCP socket, tcp_in() is called with a NULL
packet to start establishing a connection. That in turn leads to a SYN
packet being produced which, depending on the Ethernet driver, may
result in a synchronous transmit of that packet. After that, the
connect() implementation, which at this point is executing
net_tcp_connect(), starts waiting to take a semaphore until the
connection timeout is reached. However, if the transmit of the SYN
packet results in an RST packet being returned from the connection
destination (due to there being no listening socket) very quickly on a
local network, the device driver may deliver an interrupt which can
cause the receive path of the network stack to run, resulting in the
tcp_in() of the RST packet via the network RX thread. That can cause
tcp_conn_unref() to be called before the connecting thread has gotten
to the point of acquiring (or failing to acquire) the semaphore, which results
in a deinitialized semaphore being accessed.
This commit fixes the possible race condition by ensuring that the
connection lock mutex is held until after the connection state moves
to "in connect."
Fixes #44186
Signed-off-by: Berend Ozceri <berend@recogni.com>
When the TCP stack enters retransmission mode, the variable tracking
the amount of unacknowledged data is cleared. This prevents the stack
from detecting when the TX window is full, which could lead to queueing
an unlimited amount of data, effectively consuming all of the available
network buffers.
Prevent this by returning early from net_tcp_queue_data() in case the
TCP stack is in retransmission mode. The socket layer will take care of
retrying, just as when the window is full.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
In case a loopback or own address is used in a TCP connection, the TCP
stack delegates the actual data send to a workqueue. This is fine,
however it could lead to some artificial delays in case a lot of data
is being sent before the workqueue has a chance to execute the queued
work items. In such a case only a single packet was sent, when many
could already have been queued.
Fix this by resubmitting the queued work in case a local address is
used and there are still more packets pending to be sent.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
When the peer reports a zero-length receive window, the TCP stack
blocks any outgoing data from being queued. In case no further ACK
comes from the peer, the whole communication could stall. Fix this by
sending a simple Zero Window Probe when we detect a zero-length window.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Closing a listening socket will set the accept callback to NULL. This
could lead to a crash in case an already received packet, finalizing
the connection handshake, was processed after the socket was closed.
Therefore, it is necessary to verify that the callback is actually set
before processing such a packet.
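Sketch of the added guard (illustrative):

    /* The listening socket may already have been closed, clearing the
     * callback - drop the packet instead of dereferencing NULL.
     */
    if (conn->accept_cb == NULL) {
        return NET_DROP;
    }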
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
TCP processed IPv4/IPv6 packets without first verifying whether
IPv4/IPv6 is enabled in the system. This could lead to problems,
especially for IPv6: in case it is disabled, the sockaddr structure is
not large enough to accommodate an IPv6 address, leading to possible
out-of-bounds access on the sockaddr structure.
Fix this by adding appropriate checks where applicable.
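The pattern of the added guards, roughly (sketch):

    if (IS_ENABLED(CONFIG_NET_IPV4) && net_pkt_family(pkt) == AF_INET) {
        /* safe to treat the address as struct sockaddr_in */
    } else if (IS_ENABLED(CONFIG_NET_IPV6) &&
               net_pkt_family(pkt) == AF_INET6) {
        /* safe to treat the address as struct sockaddr_in6 */
    } else {
        return NET_DROP;   /* family not supported in this build */
    }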
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
The peer may send a zero-length keepalive message, probing the receive
window size - the TCP stack should still reply to such packets,
otherwise the connection will stall.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Add an implementation of the net_tcp_update_recv_wnd() function.
Move the window-decreasing code to the TCP module - the receive window
has to be decreased before sending an ACK, which was not possible when
the window was decreased in the receive callback function.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
This reverts commit e7489d8de7.
It also fixes the deadlock by allowing only one thread to actually
clean up the connection when the ref_count reaches 0.
Signed-off-by: Daniel Nejezchleb <dnejezchleb@hwg.cz>
Unlock tcp_lock when calling the recv_cb. In case a connection is
being closed from both the TCP stack and the application, a race
condition can happen, resulting in the two sides locking each other
out on tcp_lock and the socket lock.
Signed-off-by: Daniel Nejezchleb <dnejezchleb@hwg.cz>
Increment the send retry counter every time after tcp_send_data()
when resending. That way, unhandled return values can time out after
the configured number of tcp_retries.
Signed-off-by: Daniel Nejezchleb <dnejezchleb@hwg.cz>
A common pattern here was to take the work item as the subfield of a
containing object. But the contained field is not a k_work, it's a
k_work_delayable.
Things were working only because the work field was first, so the
pointers had the same value. Do things right and fix things to
produce correct code if/when that field ever moves within delayable.
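The corrected pattern for a k_work_delayable embedded in a containing
object looks roughly like this (`send_timer` is just an example member
name):

    static void tcp_send_process(struct k_work *work)
    {
        struct k_work_delayable *dwork = k_work_delayable_from_work(work);

        /* Recover the container from the k_work_delayable member, not
         * from the inner k_work - still correct if the work field ever
         * stops being the first member of k_work_delayable.
         */
        struct tcp *conn = CONTAINER_OF(dwork, struct tcp, send_timer);

        ARG_UNUSED(conn);
        /* ... */
    }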
Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
Replace unpacked in6_addr structures with raw buffers in net_ipv6_hdr
struct, to prevent compiler warnings about unaligned access.
Remove __packed parameter from `struct net_6lo_context` since the
structure isn't really serialized.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
Replace unpacked in_addr structures with raw buffers in net_ipv4_hdr
struct, to prevent compiler warnings about unaligned access.
Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
The TCP2 name is no longer needed, as it is the only implementation
since the legacy one has been removed.
Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
Remove legacy TCP stack as it is replaced by the new TCP2 stack.
The TCP2 stack has been the default stack since the 2.4 release.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Currently there is no way to distinguish between a caller
explicitly asking for a semaphore with a limit that
happens to be `UINT_MAX` and a semaphore that just
has a limit "as large as possible".
Add `K_SEM_MAX_LIMIT`, currently defined as `UINT_MAX`, akin to
`K_FOREVER` versus just passing some very large wait time.
In addition, the `k_sem_*` APIs were type-confused, where
the internal data structure was `uint32_t`, but the APIs took
and returned `unsigned int`. This changes the underlying data
structure to also use `unsigned int`, as changing the APIs
would be a (potentially) breaking change.
These changes are backwards-compatible, but it is strongly suggested
to take a quick scan for `k_sem_init` and `K_SEM_DEFINE` calls with
`UINT_MAX` (or `UINT32_MAX`) and replace them with `K_SEM_MAX_LIMIT`
where appropriate.
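For example, for a semaphore meant to have no practical upper bound:

    /* Before: is UINT_MAX a deliberate limit or just "very large"? */
    k_sem_init(&tx_sem, 1, UINT_MAX);

    /* After: the intent is explicit. */
    k_sem_init(&tx_sem, 1, K_SEM_MAX_LIMIT);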
Signed-off-by: James Harris <james.harris@intel.com>
Uses of k_work_pending are to be replaced by k_work_is_pending which
conforms to current proposed naming guidelines.
Both uses in this file are fragile: that a work item is pending does
not mean changes since it was first submitted are guaranteed to be
seen when the work item begins (began) executing.
As long as this module is expected to be replaced by tcp2 it doesn't
seem worth trying to fix the logic, so just switch to the new function
name.
Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
In TCP, we increase the net_pkt refcount in order to resend
it later if we do not receive an ACK in time. Because we are not
getting a new net_pkt, the TXTIME statistics would be calculated
incorrectly. So if we re-send, reset the net_pkt creation time.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
A net context in LISTENING mode waits for incoming connections; once
a new connection is established, a new net context is spawned which is
responsible for handling the new connection.
Therefore, when closing a LISTENING context it is not useful to send a
FIN, as it was never connected. Actually closing the connection is done
by calling close on the spawned net context, which is returned by the
accept call.
Signed-off-by: Léonard Bise <leonard.bise@gmail.com>