Commit graph

221 commits

Author SHA1 Message Date
Robert Lubos
5af3c6ca90 net: tcp: Fix possible deadlock in tcp_in()
After introducing SO_SNDBUF socket option, a possible deadlock situation
slipped into the TCP implementation. The scenario for the deadlock:

  * application thread tries to send some data, it enters
    net_context_send() which locks the context mutex,
  * internal context_sendto() blocks on a TX packet allocation, if the
    TX pool is empty rescheduling takes place,
  * now, if at the same time some incoming packet has arrived (ACK for
    example), TCP stack enters tcp_in() function from a different
    thread. The function locks the TCP connection mutex, and tries to
    obtain the SNDBUF option value. net_context_get_option() tries to
    lock the context mutex, but it is already held by the transmitting
    thread, so the receiver thread blocks
  * when TX packet is available again, the transmitting thread unblocks
    and tries to pass the packet down to TCP stack. net_tcp_queue_data()
    is called which attempts to lock the TCP connection mutex, but it is
    already held by the receiving thread. Both threads are in a deadlock
    now with no chance to recover.

Fix this, by obtaining the SNDBUF option value in tcp_in() before
locking the TCP connection mutex.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2022-05-11 11:04:22 +02:00
Gerard Marull-Paretas
5113c1418d subsystems: migrate includes to <zephyr/...>
In order to bring consistency in-tree, migrate all subsystems code to
the new prefix <zephyr/...>. Note that the conversion has been scripted,
refer to zephyrproject-rtos#45388 for more details.

Signed-off-by: Gerard Marull-Paretas <gerard.marull@nordicsemi.no>
2022-05-09 12:07:35 +02:00
Sjors Hettinga
9392b12d4b net: tcp: Set the FIN_TIMEOUT to allow all FIN retries
Instead of using a fixed fin timeout, compute it based on the number
of retries. Fixes issue found by PR 44545.

Signed-off-by: Sjors Hettinga <s.a.hettinga@gmail.com>
2022-04-27 10:08:07 +02:00
Mohan Kumar Kumar
f105ea6ef5 net: add sndbuf socket option
Introduce set/get SO_SNDBUF option using the setsockopt
function. In addition, for TCP, check the sndbuf value
before queuing data.

Signed-off-by: Mohan Kumar Kumar <mohankm@fb.com>
2022-04-11 10:23:31 +02:00
Mohan Kumar Kumar
5d37de8551 net: add rcvbuf socket option
Introduce set/get SO_RCVBUF option using the setsockopt
function. In addition, use the rcvbuf value to set the
tcp recv window.

Signed-off-by: Mohan Kumar Kumar <mohankm@fb.com>
2022-04-01 13:30:09 +02:00
Berend Ozceri
782c7b6026 net: tcp: Fix possible race condition in connection establishment
When connect() is called on a TCP socket, tcp_in() is called with a NULL
packet to start establishing a connection. That in turn leads to a SYN
packet being produced which, depending on the Ethernet driver, may
result in a synchronous transmit of that packet. After that, the
connect() implementation, which at this point is executing
net_tcp_connect() starts waiting to take a semaphore until the
connection timeout is reached. However, if the transmit of the SYN
packet results in a RST packet being returned from the connection
destination (due to there being no listening socket) very quickly on a
local network, the device driver may deliver an interrupt which can
cause the receive path of the network stack to run, resulting in the
tcp_in() of the RST packet via the network RX thread. That can cause
tcp_conn_unref() to be called before the connecting thread has gotten
to the point of acquiring (or failing to) the semaphore, which results
in a deinitialized semaphore being accessed.

This commit fixes the possible race condition by ensuring that the
connection lock mutex is held until after the connection state moves
to "in connect."

Fixes #44186

Signed-off-by: Berend Ozceri <berend@recogni.com>
2022-03-31 14:30:23 -05:00
Robert Lubos
0088aaefa0 net: tcp: Do not accept new data in retransmission mode
When TCP stack enters retransmission mode, the variable tracking the
amount of unacknowledged data is cleared. This prevents the stack from
detecting when TX window is full, which could lead to queueing unlimited
amount of data, effectively consuming all of the avaiable network
buffers.

Prevent this, by returning early from net_tcp_queue_data() in case TCP
stack is in retransmission mode. The socket layer will take care of
retrying just as in case the window is full.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2022-03-30 11:08:22 +02:00
Robert Lubos
2e9866f88f net: tcp: Retrigger the send work in case of loopback
In case a loopback or own address is used in TCP connection, the TCP
stack delegates the acatual data send to a workqueue. This is fine,
however it could lead to some aritificial delays in case a lot of data
is being sent before the workqueue has a chance to execute queued work
items. In such case, we only sent a single packet, when many could've
already been queued.

Fix this, by resubmitting the queue in case a local address is used, and
there's still more packets pending for send.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2022-03-30 11:08:22 +02:00
Robert Lubos
ce93fae60b net: tcp: Simple zero window probe implementation
When peer reports a zero length receive window, the TCP stack block any
outgoing data from being queued. In case no further ACK comes from the
peer, the whole communication could stall. Fix this by sending a simple
Zero Window Probe, when we detect a Zero Length Window.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2022-03-30 11:08:22 +02:00
Nazar Kazakov
f483b1bc4c everywhere: fix typos
Fix a lot of typos

Signed-off-by: Nazar Kazakov <nazar.kazakov.work@gmail.com>
2022-03-18 13:24:08 -04:00
Robert Lubos
81b92fcbb3 net: tcp: Verify accept callback before use
Closing a listening socket will set the accept callback to NULL.
This could lead to a crash, in case an already received packet,
finalizing the connection handshake, was processed after the socket was
closed. Thereby, it's needed to verify if the callback is actually set
before processing it.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2022-03-16 16:23:16 +01:00
Robert Lubos
2f2c356ca7 net: tcp: Verify if IPv4/IPv6 is enabled before processing addr family
TCP processed IPv4/IPv6 packets w/o verifying first if IPv4/IPv6 is
enabled in the system. This could lead to problems especially for IPv6,
where in case it's disabled the sockaddr structure is not large enough
to accomodate IPv6 address, leading to possible out-of-bound access on
the sockaddr structure.

Fix this by adding appropriate checks where applicable.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2022-03-16 04:21:34 -07:00
Henrik Brix Andersen
e9c9caa80d net: remove unmaintained 6LoCAN implementation
Remove the unmaintained, experimental 6LoCAN (IPv6 over CAN bus)
implementation.

Fixes: #42559

Signed-off-by: Henrik Brix Andersen <hebad@vestas.com>
2022-03-09 18:07:31 +01:00
Robert Lubos
4f5a19482b net: tcp: Make receive window size configurable
Add Kconfig option to set the receive window size.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2022-02-22 10:09:45 -08:00
Robert Lubos
c25cd5a894 net: tcp: Allow to acknowledge zero-length packets
Peer may send a zero-length keepalive message, probing the recv window
size - TCP stack should still reply for such packets, otherwise
connection will stall.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2022-02-22 10:09:45 -08:00
Robert Lubos
74bc876bf5 net: tcp: Implement receive window handling
Add implementation of net_tcp_update_recv_wnd() function.

Move the window deacreasing code to the tcp module - receive window
has to be decreased before sending ACK, which was not possible when
window was decreased in the receive callback function.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2022-02-22 10:09:45 -08:00
Daniel Nejezchleb
1f22a2bc3c net: tcp: Fix possible deadlock in tcp_conn_unref()
This reverts commit e7489d8de7.

And fixes the deadlock by allowing only 1 thread to actualy clean up
the connection when the ref_count is 0.

Signed-off-by: Daniel Nejezchleb <dnejezchleb@hwg.cz>
2022-02-10 10:59:03 +01:00
Daniel Nejezchleb
e7489d8de7 net: tcp: Fix possible deadlock in tcp_conn_unref()
Unlock tcp_lock when calling the recv_cb. In case when
a connection is being closed from both the tcp stack
and the application, a race condition can happen resulting
in locking each other out on tcp_lock and socket lock.

Signed-off-by: Daniel Nejezchleb <dnejezchleb@hwg.cz>
2022-02-07 09:28:25 -05:00
Daniel Nejezchleb
208927d640 net: tcp: Fixed forever loop in tcp_resend_data
Increments send retry every time
after the tcp_send_data when resending.
That way unhandled return values can time
out after set amount of tcp_retries.

Signed-off-by: Daniel Nejezchleb <dnejezchleb@hwg.cz>
2022-02-04 10:56:25 +01:00
Yong Cong Sin
731241f8d0 kernel: workq: Fix type errors in delayable work handlers
A common pattern here was to take the work item as the subfield of a
containing object. But the contained field is not a k_work, it's a
k_work_delayable.

Things were working only because the work field was first, so the
pointers had the same value. Do things right and fix things to
produce correct code if/when that field ever moves within delayable.

Signed-off-by: Yong Cong Sin <yongcong.sin@gmail.com>
2022-02-02 18:43:12 -05:00
Robert Lubos
666e9f80d6 net: ipv6: Remove in6_addr from packed net_ipv6_hdr struct
Replace unpacked in6_addr structures with raw buffers in net_ipv6_hdr
struct, to prevent compiler warnings about unaligned access.

Remove __packed parameter from `struct net_6lo_context` since the
structure isn't really serialized.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2021-11-25 10:46:35 -05:00
Robert Lubos
064200b420 net: ipv4: Remove in_addr from packed net_ipv4_hdr struct
Replace unpacked in_addr structures with raw buffers in net_ipv4_hdr
struct, to prevent compiler warnings about unaligned access.

Signed-off-by: Robert Lubos <robert.lubos@nordicsemi.no>
2021-11-25 10:46:35 -05:00
Tomasz Bursztyka
32db35a721 net/tcp: Rename TCP2 to TCP
TCP2 is no longer needed as it is the unique implementation since the
legacy one has been removed.

Signed-off-by: Tomasz Bursztyka <tomasz.bursztyka@linux.intel.com>
2021-11-11 07:26:41 -05:00
Jukka Rissanen
0af89fc4ec net: Remove legacy TCP stack
Remove legacy TCP stack as it is replaced by the new TCP2 stack.
The TCP2 stack has been the default stack since 2.4 release.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2021-03-22 13:06:54 -04:00
James Harris
b10428163a kernel: sem: add K_SEM_MAX_LIMIT
Currently there is no way to distinguish between a caller
explicitly asking for a semaphore with a limit that
happens to be `UINT_MAX` and a semaphore that just
has a limit "as large as possible".

Add `K_SEM_MAX_LIMIT`, currently defined to `UINT_MAX`, and akin
to `K_FOREVER` versus just passing some very large wait time.

In addition, the `k_sem_*` APIs were type-confused, where
the internal data structure was `uint32_t`, but the APIs took
and returned `unsigned int`. This changes the underlying data
structure to also use `unsigned int`, as changing the APIs
would be a (potentially) breaking change.

These changes are backwards-compatible, but it is strongly suggested
to take a quick scan for `k_sem_init` and `K_SEM_DEFINE` calls with
`UINT_MAX` (or `UINT32_MAX`) and replace them with `K_SEM_MAX_LIMIT`
where appropriate.

Signed-off-by: James Harris <james.harris@intel.com>
2021-03-05 08:13:53 -06:00
Peter Bigot
e373004135 net: tcp: switch to new API for k_work_pending
Uses of k_work_pending are to be replaced by k_work_is_pending which
conforms to current proposed naming guidelines.

Both uses in this file are fragile: that a work item is pending does
not mean changes since it was first submitted are guaranteed to be
seen when the work item begins (began) executing.

As long as this module is expected to be replaced by tcp2 it doesn't
seem worth trying to fix the logic, so just switch to the new function
name.

Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
2021-03-04 18:00:56 -05:00
Jukka Rissanen
0f5fdec0d7 net: tcp: Reset net_pkt creation time if packet is resent
In TCP, we increase the net_pkt refcount in order to resend
it later if we do not receive ACK in time. Because we are not
getting a new net_pkt, the TXTIME statistics would be calculated
incorrectly. So if we re-send, reset the net_pkt creation time.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2020-08-07 10:12:51 +03:00
Léonard Bise
14ced754f5 net: tcp: Do not send FIN when closing listening sockets
A net context in LISTENING mode waits for incoming connections, once
a new connection is established a new net context is spawned which
is responsible for handling the new connection.
Therefore when closing a LISTENING context it is not useful to send FIN
as it is never connected. Actually closing the connection would be done
by calling close on the spawned net context which is returned by the
accept call.

Signed-off-by: Léonard Bise <leonard.bise@gmail.com>
2020-06-16 23:47:40 +03:00
Kumar Gala
a1b77fd589 zephyr: replace zephyr integer types with C99 types
git grep -l 'u\(8\|16\|32\|64\)_t' | \
		xargs sed -i "s/u\(8\|16\|32\|64\)_t/uint\1_t/g"
	git grep -l 's\(8\|16\|32\|64\)_t' | \
		xargs sed -i "s/s\(8\|16\|32\|64\)_t/int\1_t/g"

Signed-off-by: Kumar Gala <kumar.gala@linaro.org>
2020-06-08 08:23:57 -05:00
Jukka Rissanen
35dbe1c7d5 net: tcp: Refactor because of timeout overhaul
Convert to use k_timeout_t

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2020-04-09 16:07:03 +03:00
Peter Bigot
cbce6dde4c net: tcp: validate pointer before dereferencing it
See https://habr.com/en/company/pvs-studio/blog/495284/ fragment 8.

Signed-off-by: Peter Bigot <peter.bigot@nordicsemi.no>
2020-04-06 22:09:12 -04:00
Jukka Rissanen
aee31bb7c1 net: tcp: Print information when proper ACK is received
It is useful to know that we received ACK when debugging the
TCP code.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2020-03-17 13:13:58 +02:00
Jukka Rissanen
8fd677e9e1 net: tcp: Fix memory leak when lot of incoming packets
The code was leaking memory in TX side when there was lot of
incoming packets. The reason was that the net_pkt_sent() flag
was manipulated in two threads which caused races. The solution
is to move the sent flag check only to tcp.c.

Fixes #23246

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2020-03-17 13:13:58 +02:00
Oleg Zhurakivskyy
8a77b53053 net: core: Drop NET_ASSERT_INFO() macro
A single assert macro can serve a purpose here.

Signed-off-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
2020-01-08 14:10:21 +02:00
Ravi kumar Veeramally
cf9ad748ba net: ipv4: Add IPv4 options length to net pkt
IPv4 header options length will be stored in ipv4_opts_len
in net_pkt structure. Now IPv4 header length will be in
net_pkt ip_hdr_len + ipv4_opts_len. So modified relevant
places of ip header length calculation for IPv4.

Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@linux.intel.com>
2019-12-16 11:35:24 +02:00
Jukka Rissanen
1274850041 net: tcp: No need to unref pkt if it was not sent
In order to avoid net_pkt ref count going to <0, do not unref
the packet if it was not sent in the first place. This can happen
if the connection was closed while we are waiting packets to be sent.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2019-10-28 13:58:59 +02:00
Jukka Rissanen
44dc3a2ebb net: tcp: Allow state transition when socket is closed
Do not print warning if transitioning from LISTEN -> CLOSED which
happens when the socket is closed.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2019-10-28 13:58:59 +02:00
Jukka Rissanen
21c10e50df net: tcp: Allow initial state to be set by net_tcp_change_state()
The initial state from CLOSED -> ESTABLISHED caused error
to be printed by state validator. This is unnecessary, so add
this as a valid state to validator.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2019-10-28 13:58:59 +02:00
Jukka Rissanen
d88f25bd76 net: tcp: Handle special case where accepted socket is closed
Handle this corner case with TCP connection closing:

1) Client A connects, it is accepted and can send data to us
2) Client B connects, the application needs to call accept()
   before we will receive any data from client A to the application.
   The app has not yet called accept() at this point (for
   whatever reason).
3) Client B then disconnects and we receive FIN. The connection
   cleanup is a bit tricky as the client is in half-connected state
   meaning that the connection is in established state but the
   accept_q in socket queue contains still data which needs to be
   cleared.
4) Client A then disconnects, all data is sent etc

The above was not working correctly as the system did not handle the
step 3) properly. The client B was accepted in the application even
if the connection was closing.

After this commit, the commit called "net: tcp: Accept connections
only in LISTENING state" and related other commits are no longer
needed and are reverted.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2019-10-28 13:58:59 +02:00
Jukka Rissanen
e73d5a6479 Revert "net: tcp: Accept connections only in LISTENING state"
This reverts commit 1a6f4a6368.

Let's try to fix the backlog handling instead of this.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2019-10-28 13:58:59 +02:00
Jukka Rissanen
97b6588976 net: tcp: When closing the connection send FIN without extra delays
The earlier code was always queuing the FIN that is sent when
connection is closed. This caused long delay (200 ms) before the peer at
the other end noticed that the connection was actually closed.
Now check if there is nothing in the queue, then send the FIN
immediately. If there is some data in the queue, flush it when a valid
ack has been received.

Fixes #19678

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2019-10-17 17:02:18 +03:00
Ravi kumar Veeramally
1a6f4a6368 net: tcp: Accept connections only in LISTENING state
Issue noticed with following scenario.

 1) TCP server is listening for connections but will handle
    only one connection at a time (e.g. echo-server sample)
 2) Client A connects, and the connection is accepted.
 3) Client B connects, instead of denying a connection,
    it is "auto" accepted (this is the actual bug) even
    if the application has not called accept().
 4) After the connection A is closed, the connection B
    gets accepted by application but now the closed
    connection A will cause confusion in the net-stack
 5) This confusion can cause memory leak or double free
    in the TCP core.

It is not easy to trigger this issue because it depends
on timing of the connections A & B.

Fixes: #18308

Signed-off-by: Ravi kumar Veeramally <ravikumar.veeramally@linux.intel.com>
2019-09-10 22:57:48 +03:00
Jukka Rissanen
29cae7e2fa net: tcp: Cleanup context if connection is not established
If we are closing connection before the connection was established,
then unref the context so that the cleanup is done properly.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2019-09-10 22:53:12 +03:00
Alexander Wachter
35f01673ac net: l2: 6LoCAN implementation
This commit is an implementation of 6LoCAN, a 6Lo adaption layer for
Controller Area Networks. 6LoCAN is not yet standardised.

Signed-off-by: Alexander Wachter <alexander.wachter@student.tugraz.at>
2019-08-08 13:25:01 +03:00
Jukka Rissanen
b68d6a9a26 net: tcp: Adjust data length if TCP options are present
Skip the TCP options before giving the data to application.
Without this, the TCP options would be passed to the application.

Fixes #17055

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2019-07-08 10:27:33 +03:00
Anas Nashif
5d001f3e41 cleanup: include/: move misc/byteorder.h to sys/byteorder.h
move misc/byteorder.h to sys/byteorder.h and
create a shim for backward-compatibility.

No functional changes to the headers.
A warning in the shim can be controlled with CONFIG_COMPAT_INCLUDES.

Related to #16539

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2019-06-27 22:55:49 -04:00
Anas Nashif
f2cb20c772 docs: fix misspelling across the tree
Found a few annoying typos and figured I better run script and
fix anything it can find, here are the results...

Signed-off-by: Anas Nashif <anas.nashif@intel.com>
2019-06-19 15:34:13 -05:00
Jukka Rissanen
30d31936a0 Revert "net: tcp: Fix ref counting for the net_pkt"
This reverts commit 9cd547f53b.

The commit we are reverting, fixed originally the issue that was
seen with zperf. There we freed the net_pkt too early while it was
still waiting for a TCP ACK. The commit 9cd547f5 seemd to fix that
issue but it was causing issues in dump_http_server sample app which
then started to leak memory. No issues were seen with echo-server
with or without the commit 9cd547f5.

So the lessons learned here is that one needs to test with multiple
network sample apps like dump_http_server, echo_server and zperf
before considering TCP fixes valid, especially fixes that touch
ref counting issues.

Fixes #15031

The next commit will fix the zperf free memory access patch.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2019-04-01 12:39:32 -04:00
Patrik Flykt
24d71431e9 all: Add 'U' suffix when using unsigned variables
Add a 'U' suffix to values when computing and comparing against
unsigned variables.

Signed-off-by: Patrik Flykt <patrik.flykt@intel.com>
2019-03-28 17:15:58 -05:00
Jukka Rissanen
9cd547f53b net: tcp: Fix ref counting for the net_pkt
The network packet ref count was not properly increased when
the TCP was retried. This meant that the second time the packet
was sent, the device driver managed to release the TCP frame even
if we had not got ACK to it.

Somewhat long debug log follows:

The net_pkt 0x08072d5c is created, we write 1K data into it, initial ref
count is 1.

net_pkt_write: pkt 0x08072d5c data 0x08075d40 length 1024
net_tcp_queue_data: Queue 0x08072d5c len 1024
net_tcp_trace: pkt 0x08072d5c src 5001 dst 5001
net_tcp_trace:    seq 0x15d2aa09 (366127625) ack 0x7f67d918
net_tcp_trace:    flags uAPrsf
net_tcp_trace:    win 1280 chk 0x0bea
net_tcp_queue_pkt: pkt 0x08072d5c new ref 2 (net_tcp_queue_pkt:850)

At this point, the ref is 2. Then the packet is sent as you see below.

net_pkt_ref_debug: TX [13] pkt 0x08072d5c ref 2 net_tcp_queue_pkt():850
net_tcp_send_data: Sending pkt 0x08072d5c (1084 bytes)
net_pkt_unref_debug: TX [13] pkt 0x08072d5c ref 1 (ethernet_send():597)

Ref is still correct, packet is still alive. We have not received ACK,
so the packet is resent.

tcp_retry_expired: ref pkt 0x08072d5c new ref 2 (tcp_retry_expired:233)
net_pkt_ref_debug: TX [10] pkt 0x08072d5c ref 2 tcp_retry_expired():233
net_pkt_unref_debug: TX [10] pkt 0x08072d5c ref 1 ... (net_if_tx():173)
net_pkt_unref_debug: TX [10] pkt 0x08072d5c ref 0 ... (net_if_tx():173)

Reference count is now wrong, it should have been 1. This is because we
did not increase the ref count when packet was placed first time into
sent list in tcp.c:tcp_retry_expired().

The fix is quite simple as you can see from this commit.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
2019-03-26 07:29:26 -05:00