Server Tries to Disconnect from Itself

Post here if you need help setting up your server, etc.
TechnoCat
On Lightcycle Grid
Posts: 31
Joined: Sat Mar 26, 2011 7:32 am

Server Tries to Disconnect from Itself

Post by TechnoCat »

I have a dedicated server that repeatedly outputs

Code: Select all

[0] User 0 timed out.
[0] Warning in void sn_DisconnectUser(int, const tOutput&, nServerInfoBase*) in /Users/manuel/Documents/Tron/Release/armagetronad-0.2.8.3.1/MacOS/../src/network/nNetwork.cpp:3286 : 
   [0] Server tried to disconnect from itself.
after appearing to successfully connect to and sync with all four master servers.

This is the fourth time this has happened; the server was working fine yesterday (apart from being inaccessible from outside the LAN it's on) .

Complete output:

Code: Select all

[0] Screen name "techno_cat" reserved for user TechnoCat@forums.
[0] Screen name "technocat" reserved for user TechnoCat@forums.
[0] Screen name "tc" reserved for user TechnoCat@forums.
[0] Screen name "tc_techno_cat" reserved for user TechnoCat@forums.
[0] Screen name "tctechnocat" reserved for user TechnoCat@forums.
repeats several hundred times (747 this time) , then

Code: Select all

[0] Bound socket to *.*.*.*:56677.
[0] Bound socket to *.*.*.*:4534.
[0] Setting SPAM_MAXLEN (Group: Cheating) deviates from its default value; clients older than 0.2.6.0 will not be allowed in.
[0] Setting CYCLE_RUBBER_TIME (Group: Visual) deviates from its default value; clients older than 0.2.8_beta2 may experience problems.
[0] Setting CYCLE_ACCEL_TUNNEL (Group: Bumpy) deviates from its default value; clients older than 0.2.8.3_alpha may experience problems.
[0] Connecting to Master Server 1...
[0] Killing user 0, ping 1.25.
[0] Login information sent. Waiting for reply...
[0] Got address 76.203.173.136:4534.
[0] Got login denial...
[0] Logging out...
[0] Done!
[0] Killing user 0, ping 1.25.
[0] Login Succeeded. User Nr. 0 
[0] Syncing with server...
[0] Relabeling NetObjects...
[0] Syncing again...
[0] Done!
[0] Sending my server info...
[0] Closing socket bound to *.*.*.*:4534
[0] Bound socket to *.*.*.*:4534.
[0] Setting SPAM_MAXLEN (Group: Cheating) deviates from its default value; clients older than 0.2.6.0 will not be allowed in.
[0] Connecting to Master Server 2...
[0] Killing user 0, ping 1.25.
[0] Login information sent. Waiting for reply...
[0] Got login denial...
[0] Logging out...
[0] Done!
[0] Killing user 0, ping 1.25.
[0] Login Succeeded. User Nr. 0 
[0] Syncing with server...
[0] Relabeling NetObjects...
[0] Syncing again...
[0] Done!
[0] Sending my server info...
[0] Closing socket bound to *.*.*.*:4534
[0] Bound socket to *.*.*.*:4534.
[0] Setting SPAM_MAXLEN (Group: Cheating) deviates from its default value; clients older than 0.2.6.0 will not be allowed in.
[0] Connecting to Master Server 3...
[0] Killing user 0, ping 1.25.
[0] Login information sent. Waiting for reply...
[0] Got login denial...
[0] Logging out...
[0] Done!
[0] Killing user 0, ping 1.25.
[0] Login Succeeded. User Nr. 0 
[0] Syncing with server...
[0] Relabeling NetObjects...
[0] Syncing again...
[0] Done!
[0] Sending my server info...
[0] Closing socket bound to *.*.*.*:4534
[0] Bound socket to *.*.*.*:4534.
[0] Setting SPAM_MAXLEN (Group: Cheating) deviates from its default value; clients older than 0.2.6.0 will not be allowed in.
[0] Connecting to Master Server 4...
[0] Killing user 0, ping 1.25.
[0] Login information sent. Waiting for reply...
[0] Got login denial...
[0] Logging out...
[0] Done!
[0] Killing user 0, ping 1.25.
[0] Login Succeeded. User Nr. 0 
[0] Syncing with server...
[0] Relabeling NetObjects...
[0] Syncing again...
[0] Done!
[0] Sending my server info...
[0] Closing socket bound to *.*.*.*:4534
[0] Bound socket to *.*.*.*:4534.
[0] Setting SPAM_MAXLEN (Group: Cheating) deviates from its default value; clients older than 0.2.6.0 will not be allowed in.
[0] Access level required to play changed from Program to OP-ed.
[0] Ping charity changed from 0 to 500.
[0] Nobody there. Taking a nap...
[0] Timestamp: 2011/07/02 13:29:47
[0] Closing socket bound to *.*.*.*:4534
[0] Bound socket to *.*.*.*:4534.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
[0] Closing socket bound to *.*.*.*:56677
[0] Bound socket to *.*.*.*:56677.
then

Code: Select all

[0] User 0 timed out.
[0] Warning in void sn_DisconnectUser(int, const tOutput&, nServerInfoBase*) in /Users/manuel/Documents/Tron/Release/armagetronad-0.2.8.3.1/MacOS/../src/network/nNetwork.cpp:3286 : 
 	[0] Server tried to disconnect from itself.
until I quit.
, leaving behind a single black feather.
User avatar
Z-Man
God & Project Admin
Posts: 11587
Joined: Sun Jan 23, 2005 6:01 pm
Location: Cologne
Contact:

Re: Server Tries to Disconnect from Itself

Post by Z-Man »

The multiple reservation messages probably just come from you including the relevant configuration file several times over.

For the WARNING message, well, it's only an internal warning, nothing to worry about per se. We weeded out several 'Killing user 0" messages recently, which are related, but just to be safe and since your setup seems to be peculiar, could you create and attach a debug recording?
TechnoCat
On Lightcycle Grid
Posts: 31
Joined: Sat Mar 26, 2011 7:32 am

Reply: “Re: Server Tries to Disconnect from Itself”

Post by TechnoCat »

Well, no one can connect to the server (see the thread at http://bit.ly/mycRQu) , so I just start the server with the debug recording arguments, let it run for a few minutes, then quit and upload it?
, leaving behind a single black feather.
TechnoCat
On Lightcycle Grid
Posts: 31
Joined: Sat Mar 26, 2011 7:32 am

Debug Recording

Post by TechnoCat »

I started the server with the debug recording arguments, then quit it after…more than a few minutes actually. Anyway, I've attached the debug recording (and the Terminal scrollback, in case it's useful) .
Attachments
debug_recording_1_Terminal_output.txt
The Terminal scrollback from the server session that was recorded
(570.76 KiB) Downloaded 132 times
debug_recording_1.zip
The requested debug recording
(73.78 KiB) Downloaded 129 times
, leaving behind a single black feather.
TechnoCat
On Lightcycle Grid
Posts: 31
Joined: Sat Mar 26, 2011 7:32 am

Reply: “Re: Server Tries to Disconnect from Itself”

Post by TechnoCat »

Z-Man wrote:The multiple reservation messages probably just come from you including the relevant configuration file several times over.
I have it set up to include all the config files at the start of every round (everytime.cfg) just so I don't have to do it manually if and when I change something while the server's running (which should be all the time) . I don't think there's any self-referential include's, but I'll check.
, leaving behind a single black feather.
TechnoCat
On Lightcycle Grid
Posts: 31
Joined: Sat Mar 26, 2011 7:32 am

Self-Referential Include Statement

Post by TechnoCat »

In server_info.cfg:

Code: Select all

SINCLUDE server_info.cfg
There's also an include statement for settings_authentication.cfg , which is where I put all of the access level and screen name reservation statements. So that explains that. I'm glad someone had the foresight to put in include loop protection. I'll just remove that line.

By the way, what's the difference between INCLUDE and SINCLUDE ?
, leaving behind a single black feather.
User avatar
Z-Man
God & Project Admin
Posts: 11587
Joined: Sun Jan 23, 2005 6:01 pm
Location: Cologne
Contact:

Re: Server Tries to Disconnect from Itself

Post by Z-Man »

The S in SINCLUDE stands for silent. It won't print any 'file not found' error messages.

Interesting, we kind of don't have a recursion protection for includes... At least I don't remember anyone ever installed one and couldn't find one in the code. I wonder how this works at all. Looks like the limit of concurrently open files is hit and open requests fail at a certain depth. I'm not sure whether we want our own recursion protection, recursive includes may be useful when applied to non-regular files. I can't think of any use right now, of course.

The warning is indeed a hint at some internal problems. What happens is:
- the server logs into the master
- during the login process, it waits for an ID for the gGame object it created earlier
- instead, it gets a timeout connection abort message after 11 seconds
- even though the login failed, it is reported as success and even though the server has already been put into standalone state by the logout
- the server falsely continues to send the master its data, it probably goes nowhere
- the server is left in a stupid state because it tried to send over an already terminated connection.

The handling of that situation after the connection abort definitely is wrong, I'll fix that.

I'm not sure why the timeout error happens in the first place, it certainly is not a problem that usually arises. The master servers are happy to hand out IDs for game objects. And 11 seconds are not a time that can be caused by usual network delays.
Digging a bit deeper, your server does send the request for IDs, but it never is acknowledged by the masters. I suspect it doesn't get through to them. Since all masters are affected, something is probably borked on your side of the network and certain packets (based on their content; other packets from the same source to the same destination obviously get through) are simply dropped. According to the master logs, everything worked all right earlier (last time on July 2nd). Something you did since then may have broken things.
TechnoCat
On Lightcycle Grid
Posts: 31
Joined: Sat Mar 26, 2011 7:32 am

Config File Versions

Post by TechnoCat »

So everything (except server logins from the Outlands) was working last on the second of July? Should I restore a copy of the config files from then and upload them? In the meantime, I can use FileMerge to find differences in them, but I don't think I'll be much help there.
, leaving behind a single black feather.
TechnoCat
On Lightcycle Grid
Posts: 31
Joined: Sat Mar 26, 2011 7:32 am

File Versions

Post by TechnoCat »

Also, could you be more specific as to when my server was working correctly? I need to know which backup to restore from, and there are multiple from yesterday (2 July) .
, leaving behind a single black feather.
User avatar
Z-Man
God & Project Admin
Posts: 11587
Joined: Sun Jan 23, 2005 6:01 pm
Location: Cologne
Contact:

Re: Server Tries to Disconnect from Itself

Post by Z-Man »

Yeah, everything except pings seemed to work. Here's the last successful contact:

Code: Select all

[2011/07/02 05:10:32] [1] Received login from 76.203.173.136:54402 via socket 192.168.0.100:4533, network version: 0.2.8.3.X (ID: 16).
[2011/07/02 05:10:32] [1] New user: 1
[2011/07/02 05:10:39] [1] Received new server: 76.203.173.136:4534
[2011/07/02 05:10:39] [1] received logout from 1.
[2011/07/02 05:10:39] [1] Killing user 1, ping 0.572732, IP 76.203.173.136:54402: You logged out regularly.
[2011/07/02 05:10:41] [65] Updated server: 66.225.195.62:4537
Here's the first unsuccessful one:

Code: Select all

[2011/07/02 20:27:36] [1] Received login from 76.203.173.136:4534 via socket 192.168.0.100:4533, network version: 0.2.8.3.X (ID: 16).
[2011/07/02 20:27:36] [1] New user: 1
[2011/07/02 20:27:41] [2] Received login from 69.246.49.46:52141 via socket 192.168.0.100:4533, network version: 0.2.8.3.X (ID: 16).
[2011/07/02 20:27:41] [2] New user: 2
[2011/07/02 20:27:42] [2] Giving server info to user 2
[2011/07/02 20:27:46] [1] Killing user 1, ping 1.25, IP 76.203.173.136:4534: You timed out.
Times are GMT+3 for some reason.
Oh wait, that wasn't really the first unsuccessful attempt, just the first of the endless series of failures. About half the previous attempts failed, too. I'll just attach the full log grep.
Attachments
technocatlog.txt.bz2
log
(2.85 KiB) Downloaded 124 times
TechnoCat
On Lightcycle Grid
Posts: 31
Joined: Sat Mar 26, 2011 7:32 am

Previous Revisions of Config Files

Post by TechnoCat »

Well, these are all from various backups, but they're the earliest on or the latest before the second. I'll look through them and see if there's anything I think might be affecting it, but like I said, I'll not be much help in this area. I could try again with WireShark if you tell me what to do. XD
Attachments
config_restored.zip
Restored versions of the configuration files.
(35.91 KiB) Downloaded 113 times
config_current.zip
The current versions of the configuration files.
(36.11 KiB) Downloaded 128 times
, leaving behind a single black feather.
TechnoCat
On Lightcycle Grid
Posts: 31
Joined: Sat Mar 26, 2011 7:32 am

Log and Compare Results

Post by TechnoCat »

Looking at the log, I see logins, logouts, and timeouts, but I don't see my server being acknowledged at all. That's not complete, though, (right?) so could you search through the full log for an acknowledgement of a server called “TronServ (Lightcycle Battle) ” ?

aiplayers.cfg , autoexec.cfg , and settings_custom.cfg haven't changed since the second.
everytime.cfg previously had an include and sinclude for settings_authentication.cfg but that's the only difference.
server_info.cfg previously had an sinclude for settings_authentication.cfg (now it's an include) and one for server_info.cfg (which was removed) . Also, I've changed ROUND_CONSOLE_MESSAGE and removed a line turning on global IDs (that's done in another file) .
settings_authentication.cfg previously had “REVERVE_SCREEN_NAME” statements. That was a typo, and it was fixed.

As far as I know, none of those changes should affect my server's ability to connect, sync, or reply to pings.
, leaving behind a single black feather.
User avatar
Z-Man
God & Project Admin
Posts: 11587
Joined: Sun Jan 23, 2005 6:01 pm
Location: Cologne
Contact:

Re: Server Tries to Disconnect from Itself

Post by Z-Man »

Unfortunately, nothing in the arma config can cause the observed type of packet loss. Or any other kind of packet loss, actually. It must be on the network somewhere.

And yeah, the server doesn't get acknowledged. That happens through the same pings clients use on the server browser and doesn't work for the same reason as them, whatever that is.
TechnoCat
On Lightcycle Grid
Posts: 31
Joined: Sat Mar 26, 2011 7:32 am

Tracking Down Point of Packet Loss

Post by TechnoCat »

I don't suppose you have access to one of the master servers? I was thinking you could try ‘traceroute -e -P UDP -p 4534 76.203.173.136’ from a master server and see how far it gets. Just to make sure the packets aren't being lost before they get to my LAN.

On second thought, it doesn't even have to be a master server, does it? It could be any computer outside of my LAN. Right?
, leaving behind a single black feather.
User avatar
Z-Man
God & Project Admin
Posts: 11587
Joined: Sun Jan 23, 2005 6:01 pm
Location: Cologne
Contact:

Re: Server Tries to Disconnect from Itself

Post by Z-Man »

Yep, I have access to two masters, but any source PC will do.

For some reason, -P UDP only gives stars, it's probably too raw. But UDP is the default protocol anyway. From my home PC, on the same LAN as my first master:

Code: Select all

$ sudo traceroute -e -U -p 4534 76.203.173.136
traceroute to 76.203.173.136 (76.203.173.136), 30 hops max, 60 byte packets
 1  10.74.192.1 (10.74.192.1)  15.842 ms  16.228 ms  16.676 ms
 2  1311G-MX960-02-ae10.huerth.unity-media.net (80.69.102.197)  17.219 ms  17.694 ms  17.987 ms
 3  13NOC-MX960-ae7.kerpen.unity-media.net (80.69.107.37) <MPLS:L=641027,E=0,S=1,T=1>  20.176 ms  20.554 ms  20.958 ms
 4  1411G-MX960-02-ae9.neuss.unity-media.net (80.69.107.6)  21.782 ms  21.282 ms  22.126 ms
 5  nl-ams05a-rd2-ae-2.aorta.net (84.116.131.141)  35.983 ms  36.466 ms  36.963 ms
 6  us-nyc01b-rd1-gi-4-0-0.aorta.net (84.116.130.30) <MPLS:L=1738,E=0,S=1,T=1>  112.736 ms us-nyc01b-rd1-gi-2-0-0.aorta.net (84.116.130.26) <MPLS:L=1738,E=0,S=1,T=1>  97.133 ms us-nyc01b-rd1-gi-4-0-0.aorta.net (84.116.130.30) <MPLS:L=1738,E=0,S=1,T=1>  97.267 ms
 7  us-nyc01b-ri1-xe-3-1-0.aorta.net (213.46.190.94)  101.050 ms  105.252 ms  105.739 ms
 8  xe-8-3-0.edge1.NewYork1.Level3.net (4.78.164.221)  114.047 ms  114.461 ms xe-4-1-0.edge1.NewYork1.Level3.net (4.78.132.93)  106.253 ms
 9  ae-3-89.edge3.NewYork1.Level3.net (4.68.16.145)  107.593 ms ae-2-79.edge3.NewYork1.Level3.net (4.68.16.81)  107.081 ms  106.603 ms
10  att-level3.newyork1.level3.net (4.68.63.142)  112.662 ms  112.070 ms  113.035 ms
11  cr1.n54ny.ip.att.net (12.122.81.126) <MPLS:L=16250,E=0,S=1,T=255>  186.974 ms  184.585 ms  175.903 ms
12  cr2.cgcil.ip.att.net (12.122.1.2) <MPLS:L=23101,E=0,S=0,T=255/L=17195,E=0,S=1,T=255>  169.796 ms  168.991 ms  172.291 ms
13  cr1.cgcil.ip.att.net (12.122.2.53) <MPLS:L=23567,E=0,S=0,T=255/L=17195,E=0,S=1,T=255>  172.844 ms  175.772 ms  179.063 ms
14  cr1.sffca.ip.att.net (12.122.4.121) <MPLS:L=0,E=0,S=0,T=255/L=17195,E=0,S=1,T=255>  179.837 ms  180.380 ms  169.541 ms
15  12.123.155.117 (12.123.155.117)  177.468 ms  177.275 ms  177.497 ms
16  151.164.54.110 (151.164.54.110)  177.353 ms  172.355 ms  167.346 ms
17  dist4-g9-2.pltnca.sbcglobal.net (151.164.42.103)  169.918 ms  169.618 ms  172.230 ms
18  bras36-g9-0.pltnca.sbcglobal.net (64.164.107.15)  174.545 ms  169.661 ms  169.231 ms
19  * * *
... (only more stars)

From my other master, which doesn't understand -e:

Code: Select all

root@vs187011:~# traceroute -U -p 4534 76.203.173.136
traceroute to 76.203.173.136 (76.203.173.136), 30 hops max, 40 byte packets
 1  static-ip-85-25-32-37.inaddr.intergenia.de (85.25.32.37)  0.072 ms  0.029 ms  0.028 ms
 2  217.118.16.25 (217.118.16.25)  3.221 ms  3.178 ms  3.138 ms
 3  194.25.211.53 (194.25.211.53)  3.097 ms  3.102 ms  3.137 ms
 4  217.239.40.78 (217.239.40.78)  86.437 ms  86.393 ms  86.354 ms
 5  192.205.34.149 (192.205.34.149)  84.846 ms  84.815 ms  84.859 ms
 6  cr1.n54ny.ip.att.net (12.122.80.226)  155.696 ms  155.820 ms  155.425 ms
 7  cr2.cgcil.ip.att.net (12.122.1.2)  155.927 ms  155.587 ms  155.504 ms
 8  cr1.cgcil.ip.att.net (12.122.2.53)  156.287 ms  156.245 ms  156.199 ms
 9  cr1.sffca.ip.att.net (12.122.4.121)  155.181 ms  155.141 ms  155.097 ms
10  12.123.155.117 (12.123.155.117)  154.678 ms  154.642 ms  154.646 ms
11  151.164.54.98 (151.164.54.98)  315.058 ms  166.059 ms  166.041 ms
12  151.164.93.230 (151.164.93.230)  154.370 ms *  154.392 ms
13  bras36-g9-0.pltnca.sbcglobal.net (64.164.107.15)  156.985 ms  157.700 ms  157.809 ms
14  * * *
...
For reference, the same with my first master as destination:

Code: Select all

root@vs187011:~# traceroute -U -p 4534 moosnet.homelinux.net
traceroute to moosnet.homelinux.net (62.143.245.70), 30 hops max, 40 byte packets
 1  static-ip-85-25-32-37.inaddr.intergenia.de (85.25.32.37)  0.077 ms  0.025 ms  0.023 ms
 2  * * *
 3  1411G-MX960-01-ae10-300.neuss.unity-media.net (194.146.118.35)  7.482 ms  7.470 ms  7.439 ms
 4  1211F-MX960-01-ae8.dortmund.unity-media.net (80.69.107.10)  9.357 ms  9.332 ms  9.304 ms
 5  1311G-MX960-01-ae9.huerth.unity-media.net (80.69.107.41)  10.886 ms  10.862 ms  10.862 ms
 6  PH-1311G-uBR10k-01-Po1.huerth.unity-media.net (80.69.102.194)  9.717 ms  10.075 ms  10.023 ms
 7  ip-62-143-245-70.unitymediagroup.de (62.143.245.70)  23.700 ms  27.426 ms  27.391 ms
That one goes through.

You should probably traceroute out and see which hop 64.164.107.15 is from your end. I'd guess it's the first outside your LAN.
Post Reply