[Time] Name | Message |
[06:19] Bakafish
|
Hello group. I was hacking on the new "Butterfly" example code and have an issue where I successfully send a message over an IPC channel, but the blocking read side never seems to see it.
|
[06:26] Bakafish
|
I'm referring to the final 'sync' message indicating that all the test packets were received. The test packets are going across a similar channel just fine. But the final sync packet to trigger the end of the timer is sent and the "Sending" process never seems to see it.
|
[06:30] sustrik
|
Bakafish: does the packet cross the network?
|
[06:30] sustrik
|
mikko: re
|
[06:31] Bakafish
|
It's using IPC (pipes), so no :-)
|
[06:31] sustrik
|
ah
|
[06:33] Bakafish
|
Is it a situation where the packet is below some size threshold, so it's being cached in a queue? I'm grasping at straws here.
|
[06:34] sustrik
|
Bakafish: no
|
[06:34] sustrik
|
it should be passed immediately
|
[06:34] sustrik
|
so all the components are running on the same box
|
[06:35] sustrik
|
using IPC for communication
|
[06:35] sustrik
|
right?
|
[06:35] Bakafish
|
I'm getting a good return on the send side. The blocking read blocks correctly, it just never sees the packet.
|
[06:35] Bakafish
|
That's correct.
|
[06:36] Bakafish
|
The other, seemingly identical connections are behaving correctly.
|
[06:36] sustrik
|
is the IPC filename the same on the send and recv side?
|
[06:36] sustrik
|
(sanity check)
|
[06:38] Bakafish
|
:-) I checked for that, and lo and behold, one arg was ipc:///tmp/zmq/send_sync and the other ipc://tmp/zmq/send_sync
|
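For reference: with the ipc:// transport, everything after "ipc://" is the pathname handed to the underlying UNIX domain socket, so ipc:///tmp/zmq/send_sync names the absolute path /tmp/zmq/send_sync, while ipc://tmp/zmq/send_sync names tmp/zmq/send_sync relative to the working directory, and the two ends never meet. A minimal C sketch of the mismatch, assuming a 2.0-era libzmq where zmq_init() takes just the I/O-thread count; the socket types and which side had the missing slash are illustrative, not taken from the Butterfly example:

    #include <zmq.h>

    int main (void)
    {
        void *ctx = zmq_init (1);
        void *rd = zmq_socket (ctx, ZMQ_UPSTREAM);     /* blocking-read side */
        void *wr = zmq_socket (ctx, ZMQ_DOWNSTREAM);   /* sending side */

        /* Binds the UNIX domain socket at the absolute path
           /tmp/zmq/send_sync (the /tmp/zmq directory must exist). */
        zmq_bind (rd, "ipc:///tmp/zmq/send_sync");

        /* Missing slash: this names tmp/zmq/send_sync relative to the
           current working directory, i.e. a different file, so the sync
           message never arrives. */
        /* zmq_connect (wr, "ipc://tmp/zmq/send_sync"); */

        /* Correct: the same absolute endpoint on both sides. */
        zmq_connect (wr, "ipc:///tmp/zmq/send_sync");

        zmq_close (wr);
        zmq_close (rd);
        zmq_term (ctx);
        return 0;
    }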
[06:38] Bakafish
|
Way to introduce myself to the group :-(
|
[06:39] Bakafish
|
I was ls-ing my tmp directory and saw the 3 expected files, but didn't think to check for a missing slash. Thanks for the help!
|
[06:40] sustrik
|
no problem :)
|
[07:29] Bakafish
|
Okay, a question probably rivaling the stupidity of my prior one. In the case of more than one agent connecting to a ZMQ_DOWNSTREAM it round-robins great, but if an agent goes away the sender doesn't seem to be aware of it, and packets for that agent are seemingly queued or lost. How do you access the queue registry and manipulate it?
|
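For context, this is the 2.0-era pipeline pattern: a ZMQ_DOWNSTREAM socket round-robins outgoing messages across whatever peers are currently connected, and the library doesn't expose a registry of those peers for the application to inspect or edit. A rough C sketch of the sending side, with the endpoint, message body, and counts purely illustrative:

    #include <string.h>
    #include <zmq.h>

    int main (void)
    {
        void *ctx = zmq_init (1);

        /* Work is round-robined across every processor that has connected. */
        void *work = zmq_socket (ctx, ZMQ_DOWNSTREAM);
        zmq_bind (work, "tcp://*:5555");

        for (int i = 0; i != 1000; i++) {
            zmq_msg_t msg;
            zmq_msg_init_size (&msg, 5);
            memcpy (zmq_msg_data (&msg), "check", 5);
            /* If a processor has died but its connection hasn't been torn
               down yet, messages routed to it just sit in its pipe. */
            zmq_send (work, &msg, 0);
            zmq_msg_close (&msg);
        }

        zmq_close (work);
        zmq_term (ctx);
        return 0;
    }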
[07:43] sustrik
|
Bakafish: the messages already delivered to an application are lost once the application crashes
|
[07:43] sustrik
|
you can limit the damage
|
[07:43] sustrik
|
by setting HWM socket option
|
[07:43] sustrik
|
which specifies how many messages may be queued at any given moment
|
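A sketch of what setting that option looks like in the 2.0-era C API, where ZMQ_HWM is a single uint64_t (the value and socket are illustrative; later versions split it into ZMQ_SNDHWM/ZMQ_RCVHWM):

    #include <stdint.h>
    #include <zmq.h>

    /* Cap how many outstanding messages may queue toward any one peer. */
    void limit_queue (void *socket)
    {
        uint64_t hwm = 100;   /* illustrative: at most ~100 queued messages */
        zmq_setsockopt (socket, ZMQ_HWM, &hwm, sizeof hwm);
    }

In that API the option only affects connections made after it is set, so it would go before the zmq_bind/zmq_connect calls.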
[07:46] Bakafish
|
That's fair, but it's not what I mean. If I launch the 'Sender' with two 'processor' nodes and one 'results' node and run 1000 checks, everything is fine: 500 each. Then I kill one processor node, so there is only a single node in the pipeline. The queue still thinks there are two (timeout not met yet?), and a second run results in only 500 of the checks reaching the single remaining processor instance.
|
[07:47] sustrik
|
let me see...
|
[07:48] sustrik
|
hm, i am not sure how UNIX domain sockets handle application crashes
|
[07:48] sustrik
|
maybe it takes some time to notify the sender that the receiver is dead?
|
[07:49] sustrik
|
can you try sleeping for a second after killing the app
|
[07:49] sustrik
|
?
|
[07:49] Bakafish
|
If I have to write my own code to see if a process is alive or not, that's fine, but how do I identify the processor nodes attached to a particular queue?
|
[07:50] Bakafish
|
The server is dispatching these checks, so I'd assume it has some knowledge of how many there are.
|
[07:50] Bakafish
|
Ahh, this isn't a problem with TCP?
|
[07:51] Bakafish
|
Meaning, if I was using TCP instead of sockets I shouldn't expect this behavior?
|
[07:51] sustrik
|
well, i am not sure what's happening, however, the obvious explanation would be that there's a delay between the receiver dying and the sender being notified of the fact
|
[07:52] Bakafish
|
sockets, bah. Pipes
|
[07:52] sustrik
|
in the meantime the messages will be dispatched to that receiver
|
[07:52] sustrik
|
it applies to any communication channel (TCP, IPC, ...)
|
[07:53] Bakafish
|
I see. I will check to see if it eventually times out. Is there a setting? (I recall someone mentioning there was.)
|
[07:53] sustrik
|
nope, it's dependent on the IPC implementation
|
[07:53] sustrik
|
there's probably a timeout in the kernel somewhere
|
[07:56] Bakafish
|
Ahhh, the HWM socket option was what was discussed. I think someone was saying you could limit it, but that there were all sorts of places where messages could be queued along the path, and that reducing the number of messages in the queue had performance implications.
|
[07:58] Bakafish
|
Let me play with it some more. I was using local pipes for convenience; I will be using TCP in production, so it may behave better.
|
[13:34] mikko
|
sustrik: did you notice the issue i opened at github over the hols?
|
[13:47] sustrik
|
mikko: yes, i did
|
[14:09] mikko
|
cool
|
[14:09] mikko
|
i'll have zmq_poll implemented soon
|
[14:09] mikko
|
hopefully
|
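For reference, the underlying libzmq call the binding wraps; a minimal C sketch with an illustrative one-second wait (in the 2.x C API the timeout is given in microseconds):

    #include <zmq.h>

    /* Returns 1 if the socket has a message ready within one second. */
    int wait_for_input (void *socket)
    {
        zmq_pollitem_t items [1];
        items [0].socket = socket;
        items [0].fd = 0;
        items [0].events = ZMQ_POLLIN;
        items [0].revents = 0;

        int rc = zmq_poll (items, 1, 1000000);
        return rc > 0 && (items [0].revents & ZMQ_POLLIN);
    }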
[14:15] sustrik
|
mikko: seen you've done ~20 commits a day during the holidays :)
|
[14:21] mikko
|
open source is made by night
|
[14:21] mikko
|
:)
|
[14:21] mikko
|
at least in my case
|
[14:28] sustrik
|
mikko: i've linked the PHP wiki page from the main page
|
[14:28] mikko
|
sustrik: thanks
|
[14:28] sustrik
|
you should announce the availability of the binding on the mailing list
|
[14:29] mikko
|
sustrik: i'll announce it as soon as the API stabilizes
|
[14:29] sustrik
|
goodo
|
[14:29] mikko
|
still need to add polling support
|
[14:29] sustrik
|
sure
|
[14:29] mikko
|
after that it's fairly complete
|
[14:29] mikko
|
some of the tests are failing due to zeromq2 issue #12
|
[14:32] sustrik
|
i've reproduced the problem
|
[14:32] sustrik
|
trying to fix it...
|
[14:32] mikko
|
different errors with different socket types
|
[14:32] sustrik
|
yes
|
[14:37] mikko
|
i'm gonna take a short nap, been touring around Berlin all day
|
[14:37] mikko
|
laters
|
[14:52] sustrik
|
see you
|