[Time] Name | Message |
[08:26] sustrik
|
Guthur: what's pipe ops?
|
[08:33] CIA-21
|
zeromq2: Michael Compton master * rfbe5d85 / (AUTHORS doc/zmq_setsockopt.txt):
|
[08:33] CIA-21
|
zeromq2: Added note regarding setting sockopt before bind/connect
|
[08:33] CIA-21
|
zeromq2: Signed-off-by: Michael Compton <michael.compton@littleedge.co.uk> - http://bit.ly/fNGMan
|
[11:22] cyball
|
hi can i do something like this :: REQ -> XREP->XREQ->SUB with zmq_device::ZMQ_FORWARDER ? i want to publish a message to more than one subscriber :) thx
|
[11:23] cyball
|
or is it also possible to work with ... REQ->PUB->SUB ?
|
[11:31] mikko
|
cyball: do you need req sock on client?
|
[11:32] mikko
|
cyball: you could have PUB on client and use forwarder device
|
[11:32] mikko
|
client PUB <---> SUB forwarder PUB <---> SUB subscribers
|
[11:34] cyball
|
mikko, i work on a continuous integration service and i get a post-commit hook from github, so i can not run the PUB the whole time, only on request from github :)
|
[11:35] mikko
|
cyball: you can have the pub connect to forwarder on post-commit ?
|
[11:35] mikko
|
connect, publish, go away
|
[11:36] cyball
|
mikko, ohh ok .. i thought that i have to run the publisher the whole time
|
[11:36] cyball
|
because of the subscribers
|
[11:36] mikko
|
the subscribers would be connected to forwarder device
|
[11:36] mikko
|
which can run all the time
|
[11:37] cyball
|
ok
|
[11:37] mikko
|
look at the diagram:
|
[11:37] mikko
|
client PUB <---> SUB forwarder PUB <---> SUB subscribers
|
[11:37] mikko
|
:)
|
[11:38] cyball
|
mikko, thx
|
[11:38] mikko
|
forwarder would run all the time and subscribers would only know that they are connected to it
|
[11:38] cyball
|
ok
|
[11:38] cyball
|
i will have a look on it
|
[11:38] cyball
|
do you have a link to it ?
|
[11:39] mikko
|
link to where?
|
[11:39] cyball
|
diagrag :)
|
[11:39] cyball
|
diagram :)
|
[11:39] mikko
|
it's that ascii one
|
[11:39] mikko
|
let me see if zguide has similar with prettier graphics
|
[11:39] cyball
|
no sorry, i thought there was one in the manual that i had not seen
|
[11:41] mikko
|
it's a very simple scenario
|
[11:41] mikko
|
the publisher client connects to insocket on forwarder and publishes message
|
[11:41] mikko
|
forwarder then publishes it on outsocket
|
[11:42] mikko
|
and subscribers are connected to outsocket
|
[11:42] mikko
|
forwarder runs in the background and subscribers only know about its existence
|
[11:42] mikko
|
publishers on insocket can come and go as they please
|
[11:42] cyball
|
mikko, is that ok ? http://pastebin.com/1USs18EG
|
[11:45] mikko
|
cyball: subscribe to "" on frontend
|
[11:46] mikko
|
so before the frontend bind
|
[11:47] mikko
|
zmq_setsockopt(frontend, ZMQ_SUBSCRIBE, "", 0);
|
[11:47] mikko
|
otherwise your frontend will filter all messages
|
[12:03] cyball
|
mikko, http://pastebin.com/yfHkwBv1 is that ok for the publisher on the client ?
|
[12:04] cyball
|
or should i also add some socket options too ?
|
[12:04] mikko
|
should be fine
|
[12:19] cyball
|
mikko, ok i have put all the pieces together :: http://pastebin.com/EGUTh9mY i do not see anything on the subscriber. i guess there is something i am missing, can you please have a look at it ?
|
[12:36] mikko
|
cyball: subscribe the sub
|
[12:36] mikko
|
subscriber.setsockopt(ZMQ_SUBSCRIBE, "", 0);
|
[12:36] mikko
|
otherwise it will filter all messages
|
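The filtering rule behind the advice above can be modelled in a few lines. This is only an illustrative sketch, not libzmq code: the `subscription_t` type and the `sub_accepts` function are hypothetical names. It assumes the documented behaviour that a subscription is a byte prefix, that an empty prefix matches every message, and that a SUB socket with no subscriptions drops everything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative model of SUB-side filtering; not libzmq code.
   A subscription is a byte prefix; an empty prefix matches everything. */
typedef struct {
    const char *prefix;
    size_t len;
} subscription_t;

/* Returns 1 if msg starts with at least one subscribed prefix, else 0.
   With zero subscriptions nothing matches, so every message is dropped. */
static int sub_accepts(const subscription_t *subs, size_t nsubs,
                       const char *msg, size_t msg_len)
{
    assert(msg != NULL);
    for (size_t i = 0; i < nsubs; i++) {
        if (subs[i].len <= msg_len &&
            memcmp(msg, subs[i].prefix, subs[i].len) == 0)
            return 1;
    }
    return 0;
}
```

With no subscriptions the loop body never runs, which is why both the forwarder's frontend and the final subscriber need an explicit (possibly empty) ZMQ_SUBSCRIBE.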
[12:36] mikko
|
i gotta commute to the office
|
[12:36] mikko
|
back in 30 mins or so
|
[12:37] cyball
|
thx
|
[12:42] cyball
|
it does not work :-(
|
[12:47] cyball
|
sure that i can have a SUB bind on a port and it does not only support connect ?
|
[12:51] sustrik
|
yes, bind/connect are orthogonal to the socket type
|
[12:55] cyball
|
sustrik, thx
|
[12:57] cyball
|
sustrik, can you please have a look on the code ... ? http://pastebin.com/EGUTh9mY
|
[12:57] cyball
|
probably there is something missing ... i have added the subscriber.setsockopt(ZMQ_SUBSCRIBE, "", 0);
|
[12:57] cyball
|
but it also does not work
|
[12:58] sustrik
|
what version of 0mq are you using?
|
[12:59] cyball
|
2.0.10
|
[13:00] sustrik
|
with 2.0.10 there's no blocking zmq_close()
|
[13:00] cyball
|
ohhh upps
|
[13:00] sustrik
|
thus, if there are any queued outbound messages
|
[13:00] sustrik
|
they are dropped on zmq_close()
|
[13:00] cyball
|
ok that means i should compile the beta ?
|
[13:00] sustrik
|
if you need to block till the messages are sent, you'll have to use 2.1.0
|
[13:01] sustrik
|
yup
|
[13:01] cyball
|
ok i will do it now :)
|
[13:06] cyball
|
sustrik, yeahhh it works now THX
|
[13:06] sustrik
|
np
|
[13:10] marcinkuzminski
|
Hi, I'm planning a system to do concurrent database inserts that is supposed to be fail-safe: when an insert fails, it would retry/remember. Can all of that be achieved using zeromq ?
|
[13:11] sustrik
|
what do you mean by fail-safe?
|
[13:12] marcinkuzminski
|
sustrik, That i cannot allow a task to be lost.
|
[13:12] marcinkuzminski
|
i run ~200-300 tasks/s
|
[13:13] marcinkuzminski
|
let's say one out of a million fails for any reason (network/db). I need to retry that task a few times with a delay, and if it fails permanently i need to store and remember that.
|
[13:14] sustrik
|
then you have to store the task into a database
|
[13:14] sustrik
|
distributed transactions should be used to pair the task generation with the insertion
|
[13:14] sustrik
|
so that either both fail or both succeed
|
[13:14] sustrik
|
it has very little to do with 0mq
|
[13:15] marcinkuzminski
|
right,
|
[13:16] marcinkuzminski
|
so it's just about distribution of tasks that zeromq does ?
|
[13:16] marcinkuzminski
|
sustrik, ok, getting back to reading 0mq manual
|
[13:16] sustrik
|
right
|
[13:19] Guthur
|
sustrik: did you catch my messages last night
|
[13:19] Guthur
|
you can ignore the first suggestion though
|
[13:19] Guthur
|
I now think the second would be better
|
[13:22] sustrik
|
Guthur: i missed it
|
[13:22] sustrik
|
can you explain once more?
|
[13:30] Guthur
|
basically would it be ok for 0MQ to pass in a custom OVERLAPPED struct with an operation type flag when performing various pipe operations. iocp_t (epoll etc equivalent) would then use this along with poll_entry, which would be set as the completion key, to determine which event handlers to call
|
[13:30] Guthur
|
I also have a question regarding the retired functionality, but I'm not near the code at the moment and can not remember the details
|
[13:31] Guthur
|
does that make any sense?
|
[13:34] sustrik
|
Guthur: pass what from where to where?
|
[13:35] sustrik
|
sorry, i don't follow
|
[13:36] sustrik
|
afaics, the OVERLAPPED should be part of poll_entry
|
[13:36] sustrik
|
actually 2 of them
|
[13:36] sustrik
|
one for write another one for read
|
[13:37] Guthur
|
when doing ReadFile or WriteFile or ConnectNamedPipe there is an argument for the OVERLAPPED, this will be returned through the IOCP when the op completes
|
[13:38] sustrik
|
the event in the OVERLAPPED will be signaled when the op completes, right?
|
[13:39] Guthur
|
you can ignore the event field, not using it
|
[13:39] sustrik
|
hm
|
[13:39] Guthur
|
but the OVERLAPPED can be extended, so that it contains custom fields
|
[13:39] sustrik
|
how are you notified about operation being exectured then?
|
[13:39] sustrik
|
executed*
|
[13:40] Guthur
|
the only way is by setting a field in the custom OVERLAPPED struct
|
[13:40] sustrik
|
i mean, is it a callback or what?
|
[13:41] Guthur
|
IOCP will return an array of OVERLAPPED_ENTRY structs, these will contain the OVERLAPPED that was passed in when starting the op
|
[13:41] sustrik
|
which function is that?
|
[13:42] sustrik
|
the one that returns the ENTRIES
|
[13:42] Guthur
|
it will also return the Completion Key which you specify when you add the handle to the IOCP
|
[13:42] Guthur
|
GetQueuedCompletionStatus
|
[13:42] Guthur
|
GetQueuedCompletionStatusEx actually (for the timeout)
|
[13:45] Guthur
|
I'm going to grab a coffee, back in a mo
|
[13:46] sustrik
|
checking the docs
|
[13:46] sustrik
|
how do you associate a particular read/write request with a specific completion port?
|
[13:48] zchrish
|
I am testing zmq::poll with a single entry pollitem_t list. After I send a packet, I check whether items[0].revents & ZMQ_POLLIN is "1" and then process the input from REP. But, in my case, it always seems to be set to "1" even though I purposely put a 5 second delay in my XEQ program. Am I doing something wrong?
|
[13:50] sustrik
|
zchrish: so you get POLLIN even though there is no message available, right?
|
[13:51] zchrish
|
I think so.
|
[13:51] sustrik
|
write a minimal test case then and report it as a bug
|
[13:51] sustrik
|
POLLIN should be signaled only if there's a message available for reading
|
[13:52] zchrish
|
ok; let me test a minimal case.
|
[13:52] mikko
|
pieterh_: i've been ripping the guts out from zfl builds
|
[13:53] pieterh
|
mikko: nice, I think... :-)
|
[13:53] mikko
|
there were quite a lot of things that didn't seem to be needed
|
[13:53] mikko
|
like checks for C++ compiler, atomic ops, linking against socket libs etc
|
[13:53] mikko
|
tested linux, mingw32 and mac os x this far
|
[13:54] mikko
|
solaris and freebsd to go from platforms i have access to
|
[13:54] pieterh
|
sounds great
|
[13:54] mikko
|
also, make check now runs zfl_selftest
|
[13:54] pieterh
|
is that the normal action, there's no "make test"?
|
[13:55] mikko
|
make check seems to be default action
|
[13:55] pieterh
|
how about I give you commit access to the git?
|
[13:55] pieterh
|
that seems simpler than pull requests
|
[13:55] pieterh
|
are you committer on zmq?
|
[13:56] mikko
|
no, im not
|
[13:56] pieterh
|
how would you like to work? I'm happy giving you commit access
|
[13:56] mikko
|
well, let's see whether you agree my thinking here:
|
[13:57] mikko
|
my thinking was to rip out as much as possible to make things maintainable and then fix per platform if there are bugs on let's say very old qnx
|
[13:57] mikko
|
example:
|
[13:58] mikko
|
mac os x was set to build without -pedantic even though ZFL builds fine with -pedantic on mac os x
|
[13:59] pieterh
|
well, you are far more expert in this than me
|
[13:59] mikko
|
sparc cpu optimization:
|
[13:59] mikko
|
-mcpu=v9
|
[13:59] mikko
|
is that really needed?
|
[13:59] pieterh
|
:-)
|
[13:59] pieterh
|
I hope you're not asking me
|
[13:59] mikko
|
i am
|
[13:59] mikko
|
it's in the build :)
|
[14:00] pieterh
|
well, mikko, my process is kind of different
|
[14:00] pieterh
|
copy some code, hack it till it works, forget about it again asap, wait for patches
|
[14:00] mikko
|
i can agree with that
|
[14:00] pieterh
|
specifically, for the tooling, which I don't want to be expert in
|
[14:01] pieterh
|
so most of what is there I copied, and left unchanged because it didn't break things
|
[14:02] pieterh
|
clearly someone who knows their stuff, like you, would rip most of it out
|
[14:02] pieterh
|
which is perfect
|
[14:03] pieterh
|
what's your github id?
|
[14:03] pieterh
|
plain mikko?
|
[14:03] mikko
|
mkoppanen
|
[14:04] Guthur
|
sustrik: the completion key will take care of that
|
[14:04] pieterh
|
ok, mikko, you are now committer on zfl
|
[14:04] Guthur
|
it is returned as part of the OVERLAPPED_ENTRY struct
|
[14:04] mikko
|
cool
|
[14:04] mikko
|
i'll hopefully finish this during this week
|
[14:04] Guthur
|
sustrik: oh wait I misread
|
[14:04] pieterh
|
:-) I'm enormously grateful...
|
[14:05] mikko
|
got company evening today so might be a bit out of game tomorrow
|
[14:05] Guthur
|
I think if you had multiple IOCP it would be returned to all
|
[14:05] pieterh
|
suddenly zfl actually builds properly across more than Ubuntu :-)
|
[14:05] Guthur
|
That's a guess though
|
[14:05] sustrik
|
what would that be good for?
|
[14:05] sustrik
|
strange
|
[14:06] Guthur
|
sustrik: that comment for me?
|
[14:06] sustrik
|
yep
|
[14:06] sustrik
|
if every event is passed to all completion ports
|
[14:06] Guthur
|
I have to admit I never really considered multiple IOCP
|
[14:06] sustrik
|
what's the point of having many of them
|
[14:07] Guthur
|
you associate the pipe handle with the IOCP
|
[14:07] sustrik
|
ah
|
[14:07] sustrik
|
how do you do that?
|
[14:07] Guthur
|
you can associate multiple pipes with the IOCP
|
[14:07] zchrish
|
so I added code to C++ example from "the guide" to hwclient.cpp and hwserver.cpp to include a single pollitem_t entry. I included a 5 second delay in the hwserver.cpp. It appears the code always enters regardless.
|
[14:07] Guthur
|
CreateIoCompletionPort
|
[14:07] Guthur
|
sustrik: ^^
|
[14:08] Guthur
|
http://msdn.microsoft.com/en-us/library/aa363862(v=VS.85).aspx
|
[14:08] pieterh
|
zchrish: could you post the code in a pastebin somewhere?
|
[14:08] zchrish
|
Sure, that's next.
|
[14:08] sustrik
|
Guthur: ok, i see
|
[14:09] Guthur
|
sustrik: so, can we make it fit?
|
[14:09] sustrik
|
i think i have a vague idea of how it works now :)
|
[14:09] mikko
|
isnt ZMQ_SUBSCRIBE an exception to setsockopts before connect/bind?
|
[14:09] sustrik
|
Guthur: there are few things to keep in mind
|
[14:09] sustrik
|
mikko: no
|
[14:09] pieterh
|
sustrik: really?
|
[14:10] mikko
|
then i found an error on zguide :)
|
[14:10] sustrik
|
yes, ZMQ_SUBSCRIBE applies to the socket as a whole
|
[14:10] pieterh
|
you can't add/remove a filter after binding?
|
[14:10] Guthur
|
i think you can
|
[14:10] sustrik
|
as opposed to other sockopts that apply only to subsequent connects/binds
|
[14:10] sustrik
|
pieterh_: yes
|
[14:10] pieterh
|
sorry, the 'no' part confused us, I think
|
[14:11] sustrik
|
Guthur: namely, that the messages should be kept in 0mq as long as possible
|
[14:11] sustrik
|
thus instead of starting asynch writes immediately for any data to send
|
[14:12] sustrik
|
you should start async send
|
[14:12] sustrik
|
wait while it completes
|
[14:12] sustrik
|
then start next send
|
[14:12] sustrik
|
etc.
|
[14:12] sustrik
|
the rationale is that if you pushed all the data to the kernel immediately
|
[14:13] Guthur
|
sustrik: and does 0MQ use the poll to do that subsequent send?
|
[14:13] sustrik
|
0mq flow control (such as HWM) won't work
|
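The send discipline described above (start one async send, wait for its completion, then start the next) might be sketched as follows. Everything here is hypothetical: `sender_t` and its functions are illustrative names, the queue is reduced to a counter, and the comment in `sender_pump` marks where a real implementation would issue the asynchronous write. The point is that pending messages stay in an application-side queue, where a high-water mark can still be enforced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sender state: messages wait in an application-side queue
   and at most one asynchronous write is outstanding at any time. */
typedef struct {
    size_t queued;     /* messages waiting in the application queue */
    size_t hwm;        /* high-water mark for that queue */
    int    in_flight;  /* 1 while an async write is outstanding */
} sender_t;

/* Start the next write if none is outstanding. Returns 1 if one started.
   A real implementation would issue the async write call here. */
static int sender_pump(sender_t *s)
{
    assert(s != NULL);
    if (s->in_flight || s->queued == 0)
        return 0;
    s->queued--;
    s->in_flight = 1;
    return 1;
}

/* Queue a message. Returns 0 on success, -1 if the HWM is reached. */
static int sender_enqueue(sender_t *s)
{
    assert(s != NULL);
    if (s->queued >= s->hwm)
        return -1;
    s->queued++;
    sender_pump(s);
    return 0;
}

/* Call when the completion for the outstanding write arrives. */
static void sender_on_complete(sender_t *s)
{
    assert(s != NULL && s->in_flight);
    s->in_flight = 0;
    sender_pump(s);
}
```

Because only one write is ever handed to the kernel, the backlog is visible in `queued` and the caller gets an explicit refusal once it reaches `hwm`.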
[14:13] sustrik
|
Guthur: all existing polling mechanisms are using sync sends
|
[14:13] sustrik
|
i.e. they poll for pollout
|
[14:14] sustrik
|
pollout is signaled if there's a space free in the kernel buffer
|
[14:14] sustrik
|
then it sends the data
|
[14:14] Guthur
|
which should be the same as an IOCP completion status signaled for a write op, correct?
|
[14:14] sustrik
|
yes
|
[14:15] sustrik
|
it should be same
|
[14:15] sustrik
|
except that IOCP itself is different from poll
|
[14:15] sustrik
|
so it'll be a bit complex
|
[14:15] sustrik
|
but the semantics should be the same, yes
|
[14:15] Guthur
|
yep, we need to send some op identifying data in the OVERLAPPED,
|
[14:16] zchrish
|
OK; here is the snippet - https://gist.github.com/827576
|
[14:16] Guthur
|
it's the only way we will know what the completion status is being returned for
|
[14:16] sustrik
|
even better: we can have a single IOCP per poller
|
[14:17] Guthur
|
but IOCP will return for all ops
|
[14:17] sustrik
|
yes, but we can place custom data to the result, right?
|
[14:17] Guthur
|
via a custom OVERLAPPED sure
|
[14:17] pieterh
|
zchrish: what does the client program print?
|
[14:18] sustrik
|
then we can identify the socket there as well as operation being performed
|
[14:18] sustrik
|
something like:
|
[14:18] sustrik
|
{
|
[14:18] sustrik
|
HANDLE socket;
|
[14:18] sustrik
|
bool read_or_write;
|
[14:18] sustrik
|
}
|
[14:18] zchrish
|
Just the normal case : Received reply 0: [World]
|
[14:18] Guthur
|
struct exactly that
|
[14:19] Guthur
|
well roughly actually
|
[14:19] Guthur
|
you don't need the Handle though, we can set the completion key to that
|
[14:19] Guthur
|
it will be returned for all completion status' for that socket
|
[14:19] zchrish
|
Never goes to "WAITING..."
|
[14:19] pieterh
|
zchrish: you are doing an infinite timeout in zmq::poll, what else would you expect to see?
|
[14:19] sustrik
|
Guthur: ok
|
[14:20] zchrish
|
Sorry, yes.
|
[14:20] pieterh
|
zchrish: note that timeout is in usec (so use 1000000 for 1 second)
|
[14:21] Guthur
|
sustrik: a bool won't cut it though, I actually intended to use an enum. There are at least 3 ops that take an OVERLAPPED: ConnectNamedPipe, ReadFile, WriteFile
|
[14:22] Guthur
|
so the connect will also be returning to the IOCP
|
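The extended-OVERLAPPED idea discussed above could look roughly like this. It is a sketch under assumptions: the enum and struct names are hypothetical and not taken from 0MQ, and on non-Windows builds a stand-in `OVERLAPPED` type is defined purely so the snippet compiles. The technique relies on `OVERLAPPED` being the first member, so the pointer returned with a completion can be cast back to the enclosing struct.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-in so the sketch compiles off Windows; the real OVERLAPPED
   comes from <windows.h> and has different members. */
typedef struct { void *placeholder; } OVERLAPPED;
#endif

/* Hypothetical operation tags; the names are not from 0MQ. */
typedef enum { IOCP_OP_CONNECT, IOCP_OP_READ, IOCP_OP_WRITE } iocp_op_t;

/* OVERLAPPED must be the first member so that a pointer to it is also
   a pointer to the enclosing struct. */
typedef struct {
    OVERLAPPED olp;
    iocp_op_t  op;
} op_overlapped_t;

/* Recover the operation type from the OVERLAPPED pointer that a
   completion hands back. */
static iocp_op_t op_from_overlapped(OVERLAPPED *olp)
{
    assert(olp != NULL);
    return ((op_overlapped_t *) olp)->op;
}
```

The socket or pipe itself does not need to be stored here if, as suggested in the discussion, it is carried by the completion key.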
[14:25] zchrish
|
So if I want to wrap my code like this, it seems like I should enter "0" and then perform the sleep myself. Is that permissible? Seems so.
|
[14:27] pieterh
|
zchrish: did you get it working as expected?
|
[14:27] pieterh
|
e.g. using a zmq_poll timeout of 1 second...
|
[14:28] pieterh
|
doing the sleep outside zmq_poll is a bad design
|
[14:29] pieterh
|
imagine your server replies after 1000 usec
|
[14:29] pieterh
|
your client won't get the response until after 1 full second
|
[14:30] zchrish
|
yes, it works. Thank you. I used "0". I agree with your assessment.
|
[14:33] sustrik
|
Guthur: right
|
[14:36] Guthur
|
cool, I think we are making progress with this. I have some rough code at home, hopefully we can go over it sometime soon
|
[14:44] zchrish
|
pieterh: So I am playing around with different ways to detect errors in network traffic flow and the one that I seem to feel most comfortable with is the concept of a watchdog thread that monitors socket states. I put my 0mq socket into a thread and if that thread doesn't reset the state, an alarm goes off. This is the best method I have learned thus far. If there are better ways that you are willing to share, please do so. Thank
|
[14:44] zchrish
|
you.
|
[14:44] pieterh
|
zchrish: you cannot share a socket between threads, remember
|
[14:45] pieterh
|
in general you need specific algorithms for different kinds of failure
|
[14:45] zchrish
|
pieterh: No, I have a variable that represents the thread state and the thread is responsible for updating that state on a timely basis.
|
[14:45] mikko
|
zchrish: what does that actually monitor?
|
[14:45] pieterh
|
you cannot share state between threads either
|
[14:45] mikko
|
that the thread is not stuck blocking?
|
[14:47] zchrish
|
Well the idea is to try to verify that the routine is cycling through its while (true) state which I have defined to do so every "x" cycles of time. I want to ensure this is the case.
|
[14:49] zchrish
|
pieterh: I am referring to "state" in a non-zeromq sense.
|
[14:49] pieterh
|
zchrish: you are IMO misusing threads quite fundamentally
|
[14:49] pieterh
|
each thread should be entirely isolated in terms of state, meaning memory
|
[14:49] pieterh
|
threads should communicate only by sending each other messages
|
[14:50] pieterh
|
threads should process a set of sockets that they own fully
|
[14:51] pieterh
|
the only object in ZMQ that's safe to share between threads is the context
|
[14:51] zchrish
|
Thank you for your feedback; I will think...
|
[14:53] stimpie
|
zchrish, you could have all your threads send a 'variable' to your watchdog thread using messages
|
[15:44] sustrik
|
Guthur: still there?
|
[15:44] Guthur
|
sure
|
[15:44] sustrik
|
there's one problem with IOCP i haven't realised
|
[15:45] sustrik
|
namely: how to implement zmq_poll()
|
[15:45] sustrik
|
?
|
[15:45] sustrik
|
given that fd_t will be HANDLE instead of SOCKET
|
[15:45] sustrik
|
we can't use select() to simulate the polling
|
[15:45] Guthur
|
yeah, that's something I meant to be asking you
|
[15:48] Guthur
|
sustrik: though you can use SOCKETS with IOCP, but I assume that's not the issue
|
[15:48] sustrik
|
the problem is that IPC descriptor *has* to be HANDLE
|
[15:49] sustrik
|
hm, well
|
[15:49] sustrik
|
the I/O thread has to poll on both TCP and IPC sockets
|
[15:50] sustrik
|
zmq_poll has to poll only on the descriptors provided by mailbox_t
|
[15:50] sustrik
|
currently SOCKET but presumably a HANDLE in the future
|
[15:51] sustrik
|
could be doable...
|
[15:51] Guthur
|
yeah, a socket handle can be a file handle so it would make sense to get them all the same
|
[15:52] Guthur
|
I think IOCP seems quite neat actually
|
[15:52] Guthur
|
a bit murky at the beginning
|
[15:53] Guthur
|
one thing though...
|
[15:53] Guthur
|
I don't think one can remove a handle from an IOCP
|
[15:53] sustrik
|
the problem with IOCP is that it doesn't provide a sane pushback mechanism
|
[15:53] Guthur
|
which beggars belief
|
[15:54] Guthur
|
sustrik: you mean the fact we have to use the overlapped struct to identify etc?
|
[15:54] sustrik
|
i mean the fact that you can push any amount of data to the socket
|
[15:55] sustrik
|
without being notified that the TCP buffer is full
|
[15:57] sustrik
|
there seems to be no equivalent to HWM when using IOCP
|
[15:57] Guthur
|
sustrik: there is data in the overlapped regarding the amount of data sent
|
[15:57] Guthur
|
you would probably also have to pass your HWM to make the comparison
|
[15:58] Guthur
|
that is off the top of my head, so there may be other nicer ways
|
[15:58] sustrik
|
i don't follow
|
[15:58] sustrik
|
what data in OVERLAPPED
|
[15:58] sustrik
|
?
|
[15:58] Guthur
|
http://msdn.microsoft.com/en-us/library/ms684342(v=vs.85).aspx
|
[15:58] Guthur
|
internalhigh
|
[15:59] sustrik
|
"The InternalHigh member was originally reserved for system use and its behavior may change. "
|
[15:59] Guthur
|
internal might have an error code for full buffer
|
[15:59] sustrik
|
it's some internal IOCP stuff
|
[15:59] sustrik
|
better not touch it
|
[15:59] Guthur
|
no i meant the part that says: The number of bytes transferred for the I/O request. The system sets this member if the request is completed without errors.
|
[16:00] Guthur
|
we can add our own stuff to the custom overlapped struct so that's not an issue
|
[16:01] Guthur
|
a custom overlapped could look like follows...
|
[16:01] Guthur
|
{ OVERLAPPED olp; OP_TYPE op; int HWM; }
|
[16:01] Guthur
|
damn
|
[16:01] Guthur
|
{
|
[16:01] Guthur
|
OVERLAPPED olp;
|
[16:01] Guthur
|
OP_TYPE op;
|
[16:02] Guthur
|
int HWM;
|
[16:02] Guthur
|
}
|
[16:02] Guthur
|
sorry for the spam
|
[16:02] Guthur
|
that's just a very crude example
|
[16:03] sustrik
|
hm, how would you limit the amount of pending outbound data?
|
[16:15] Guthur
|
hmm, yeah that HWM probably wouldn't help, it was more to explicitly show the custom Overlapped
|
[16:16] Guthur
|
but I think in terms of what is in epoll etc, we can get that easy enough with IOCP
|
[16:16] Guthur
|
agree?
|
[16:16] Guthur
|
I haven't looked much outside that
|
[16:17] sustrik
|
Guthur: in terms of functionality you can get the same with IOCP
|
[16:17] sustrik
|
although it requires a bit more work
|
[16:18] sustrik
|
in terms of performance, there can be problems with IOCP
|
[16:23] Guthur
|
i though IOCP was pretty performant
|
[16:23] Guthur
|
thought*
|
[16:24] sustrik
|
the problem i see is with under-filled outbound TCP buffer
|
[16:24] sustrik
|
to honour the HWM on the send side
|
[16:25] Guthur
|
how to the other method facilitate that?
|
[16:25] Guthur
|
to/do
|
[16:25] sustrik
|
hm, in theory we can count the number of bytes we've already sent to the socket and haven't seen acknowledgements for
|
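The byte-counting workaround mentioned above might be sketched like this. The names are hypothetical, and "acknowledged" is taken here to mean that a completion has been reported for the write. The sketch only tracks the budget; it does not perform any I/O.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical budget for outbound data handed to the OS. */
typedef struct {
    size_t pending;  /* bytes submitted but not yet reported complete */
    size_t limit;    /* cap on pending bytes; pending never exceeds it */
} outbound_budget_t;

/* Reserve len bytes. Returns 1 if the write may be submitted,
   0 if it would push pending bytes over the limit. */
static int budget_try_submit(outbound_budget_t *b, size_t len)
{
    assert(b != NULL && b->pending <= b->limit);
    if (len > b->limit - b->pending)
        return 0;
    b->pending += len;
    return 1;
}

/* Release bytes when a completion reports them as transferred. */
static void budget_on_complete(outbound_budget_t *b, size_t transferred)
{
    assert(b != NULL && transferred <= b->pending);
    b->pending -= transferred;
}
```

This is bookkeeping that readiness-based polling makes unnecessary, which is the trade-off being weighed in the conversation.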
[16:26] sustrik
|
yuck
|
[16:26] Guthur
|
ah yes I see, pretty yucky
|
[16:26] Guthur
|
so you get this for free the other ways?
|
[16:27] sustrik
|
yes, using select/poll/epoll etc.
|
[16:27] sustrik
|
any sane OS has a mechanism like this
|
[16:27] sustrik
|
Win32 has select
|
[16:27] sustrik
|
but unfortunately, it can be used just for SOCKETs
|
[16:27] sustrik
|
(i.e. not named pipes)
|
[16:28] sustrik
|
There's WSAPoll btw
|
[16:28] sustrik
|
but:
|
[16:28] sustrik
|
"The WSAPoll function is defined on Windows Vista and later."
|
[16:29] sustrik
|
so the alternative to IOCP would be to use WSAPoll on Vista and Win7
|
[16:29] Guthur
|
umm, XP is a large chunk of windows to not support
|
[16:29] sustrik
|
and fall back to select() on XP or somesuch
|
[16:30] sustrik
|
shrug
|
[16:30] Guthur
|
also may rule out a lot of windows server
|
[16:30] Guthur
|
not sure on win server kernel families though
|
[16:31] sustrik
|
Minimum supported server
|
[16:31] sustrik
|
Windows Server 2008
|
[16:32] Guthur
|
umm that's quite modern
|
[16:32] sustrik
|
damn, it works for SOCKETs only
|
[16:32] Guthur
|
hehe
|
[16:32] Guthur
|
red herring then
|
[16:40] sustrik
|
however, WSAPoll seems to have no limit on number of sockets it can poll on
|
[16:40] sustrik
|
select is by default limited to 64
|
[16:41] sustrik
|
so maybe, as a warm up, you could try to modify poll.hpp/poll.cpp to use WSAPoll on windows instead of poll
|
[16:41] sustrik
|
you would need vista/win7 for that, obviously
|
[16:43] sustrik
|
WSAPoll looks like pretty close copy of POSIX poll, so it should take some 1 hour to do that...
|
[16:46] Guthur
|
sustrik: and keep the old version as a fall back?
|
[16:46] sustrik
|
poll.cpp doesn't compile on windows
|
[16:46] sustrik
|
there's no poll() function there
|
[16:46] sustrik
|
so you won't break anything
|
[16:47] Guthur
|
oh, it uses select instead though, right?
|
[16:47] sustrik
|
right
|
[16:47] sustrik
|
you could force 0MQ to compile with poll
|
[16:48] sustrik
|
by defining ZMQ_FORCE_POLL macro
|
[16:48] sustrik
|
that would make it use poll.cpp instead of select.cpp
|
[16:48] sustrik
|
obviously, the build will fail now
|
[16:48] sustrik
|
but it can be presumably fixed by doing something like this:
|
[16:48] sustrik
|
#ifdef ZMQ_HAVE_WINDOWS
|
[16:48] sustrik
|
WSAPoll (...);
|
[16:48] sustrik
|
#else
|
[16:49] sustrik
|
poll (...);
|
[16:49] sustrik
|
#endif
|
[16:50] Guthur
|
ok, seems a reasonable well contained updated
|
[16:50] Guthur
|
update*
|
[16:51] Guthur
|
so would there be performance gains for ZMQ, how does the 64 socket limit affect ZMQ at the moment?
|
[16:53] sustrik
|
in MSVC build the limit is raised to 1024
|
[16:53] sustrik
|
still, if 0mq hits the limit it fails
|
[16:54] sustrik
|
also, poll() should be more efficient with large pollsets than select
|
[16:56] sustrik
|
Guthur: wait a sec, the current implementation of poll presumes that fd is an int
|
[16:57] Guthur
|
sustrik: can you point me to the portion of polls which IOCP does supply
|
[16:57] sustrik
|
which is not true on windows
|
[16:57] Guthur
|
sorry I'm a little inexperienced with polls and sockets in general
|
[16:57] sustrik
|
so rewriting the poll wouldn't be that easy
|
[16:57] sustrik
|
anyway, what's your question?
|
[16:58] sustrik
|
polling means that you can wait for multiple sockets at once
|
[16:58] sustrik
|
you wait either for socket becoming readable or socket becoming writeable (or both)
|
[16:58] sustrik
|
POSIX defines 2 ways of polling : select and poll
|
[16:59] sustrik
|
different unix flavours provide additional polling mechanisms:
|
[16:59] sustrik
|
epoll, /dev/poll, kqueue
|
[16:59] sustrik
|
winapi is, unfortunately, highly inconsistent
|
[16:59] Guthur
|
and there is something more than the events?
|
[17:00] sustrik
|
?
|
[17:00] sustrik
|
poll() simply exists
|
[17:00] sustrik
|
when one of the sockets is readable/writeable
|
[17:00] sustrik
|
it works in the same way as zmq_poll() does
|
[17:01] sustrik
|
exits*
|
[17:01] Guthur
|
ok, so the problem is that IOCP only notifies when an operation has completed?
|
[17:01] sustrik
|
exactly
|
[17:01] sustrik
|
it's so called AIO
|
[17:01] sustrik
|
(async I/O)
|
[17:02] sustrik
|
which is supposed to be better than standard I/O
|
[17:02] sustrik
|
however, it's not used much
|
[17:02] private_meta
|
Heya... Small question. Is there a way the server knows when a client disconnects?
|
[17:02] sustrik
|
linux, for example, never implemented AIO for sockets
|
[17:02] sustrik
|
private_meta: no
|
[17:03] private_meta
|
so I'd have to implement some heartbeat and check if it's going through?
|
[17:04] sustrik
|
it's up to you
|
[17:04] private_meta
|
Would there be better options?
|
[17:04] sustrik
|
i personally prefer timing out the request and resending afterwards
|
[17:05] private_meta
|
Well, I would have needed to know when a connection terminates unexpectedly :/
|
[17:05] sustrik
|
what does that mean?
|
[17:06] sustrik
|
network stack has no idea about "connection termination"
|
[17:07] sustrik
|
the only way to find out whether the other party is alive
|
[17:07] sustrik
|
is to send it a ping
|
[17:07] sustrik
|
and wait for a reply
|
[17:07] private_meta
|
hmm k, thank you
|
[17:07] sustrik
|
if the reply doesn't arrive in x secs, you say the "connection is broken"
|
[17:07] private_meta
|
Apparently boost asio implemented something like that under the hood
|
[17:07] sustrik
|
quite possibly
|
[17:08] private_meta
|
Just to make sure, are timeout mechanisms somehow implemented?
|
[17:09] sustrik
|
there's timeout parameter in zmq_poll() finction
|
[17:09] sustrik
|
function
|
[17:09] private_meta
|
Thank you! I'll try to figure out the rest on my own.
|
[17:20] pieterh
|
sustrik: I sent an email to the list about releases
|
[17:22] pieterh
|
private_meta: it kind of depends on the type of work you're doing
|
[17:23] pieterh
|
e.g. for pub-sub, servers don't even know clients exist
|
[17:24] pieterh
|
and 0MQ's tcp:// transport is 'disconnected' meaning nodes can go and come back invisibly
|
[17:24] private_meta
|
pieterh: I need two way communication between a server and multiple clients, and the server needs to be aware of the online status of clients
|
[17:24] pieterh
|
so you have to define what this means, "online status"
|
[17:24] pieterh
|
and then you have to explicitly send that to the server from clients
|
[17:24] pieterh
|
typically it means "alive and kicking", i.e. not frozen, not crashed, not offline
|
[17:24] private_meta
|
if the network connection between server and client is severed, the server needs to know, that's the basic thing
|
[17:25] private_meta
|
hmm
|
[17:25] pieterh
|
right
|
[17:25] pieterh
|
the other typical problems are looping application threads, CPU overload on client box, etc.
|
[17:25] pieterh
|
so a heartbeat sent by the main thread in your client is often the best thing
|
[17:25] private_meta
|
ok
|
[17:25] pieterh
|
this could be done by certain 0MQ sockets but it would not be fully reliable
|
[17:26] pieterh
|
i.e. if your main thread looped, heartbeats would still be sent out
|
[17:26] pieterh
|
also the reaction of your server to a dead client is specific to the use case
|
[17:26] pieterh
|
if you read the Guide, you'll see an example of "least recently used" routing
|
[17:26] private_meta
|
Of course. The reaction is already implemented. It's just that we need to switch the underlying server-client-infrastructure
|
[17:26] pieterh
|
it's quite easy to modify to implement heartbeats
|
[17:27] pieterh
|
I believe there are more advanced examples that actually do heartbeating
|
[17:28] pieterh
|
take a look at the peering1/3 examples
|
[17:29] pieterh
|
well, it's more complex than heartbeating but shows how to handle multiple sockets using zmq_poll
|
[17:29] pieterh
|
http://zguide.zeromq.org/chapter:all#toc50
|
[17:30] pieterh
|
mikko: there are zfl build failures from Hud^hJenkins, I've fixed that issue
|
[17:31] private_meta
|
Thanks, I'll look it up
|
[17:31] private_meta
|
*sigh* it's a pain having to switch to a new library if it's not all too compatible :/
|
[17:36] sustrik
|
pieterh: thx
|
[17:36] pieterh
|
private_meta: you can most likely make a decent emulation of your old library
|
[17:36] pieterh
|
sustrik: let's see what discussion that creates...
|
[17:37] private_meta
|
pieterh: In some way that is what I want to do. Or let's say need to do.
|
[17:37] pieterh
|
what is the old library? Boost.asio?
|
[17:39] private_meta
|
Yes
|
[17:39] pieterh
|
I'd suggest making that a public project then
|
[17:39] pieterh
|
shove it on github, announce it on zeromq-dev, get others to help you
|
[17:40] private_meta
|
But apparently, when compiled with a linux MPI compiler and when using it with MPI commands, it loses messages
|
[17:40] pieterh
|
private_meta: if your only problem is bugs, that's pretty good
|
[17:41] private_meta
|
How so?
|
[17:41] pieterh
|
should be easy to solve, if it's reproducible
|
[17:41] private_meta
|
Ahahaa... yeah, that's what we thought
|
[17:42] private_meta
|
before we spent months trying to fix it
|
[17:42] pieterh
|
months? wow... ok
|
[17:42] private_meta
|
The OpenMPI project doesn't care and not a single boost or asio developer can help
|
[17:43] pieterh
|
Can you explain briefly the relationship between boost asio and MPI?
|
[17:44] pieterh
|
Also, if you get stuck on any 0MQ issue for more than a few... days... come here or to the dev list for help
|
[17:44] private_meta
|
Well, I don't know exactly what you want detailed. We use boost asio to communicate between a server and clients, while these clients are MPI programs that run parallel code.
|
[17:44] pieterh
|
i've never used MPI and have only seen boost asio from a distance
|
[17:44] pieterh
|
does the MPI API call boost asio?
|
[17:45] pieterh
|
or is the MPI part separate from the boost asio stuff?
|
[17:45] private_meta
|
Nonono, MPI and Boost asio are not connected in any way except for our code. We use MPI for parallelization.
|
[17:45] pieterh
|
ok...
|
[17:45] private_meta
|
But still we need communication from the parallel clients to the server(s)
|
[17:45] pieterh
|
so your client apps are doing weird multithreading via MPI
|
[17:46] pieterh
|
and at the same time trying to do sane multithreading via 0MQ at the other side
|
[17:46] pieterh
|
all in a single process
|
[17:46] private_meta
|
Somewhat. I'd rather call it multiprocessing
|
[17:46] pieterh
|
:-) to be able to help, I need to map unfamiliar stuff onto words that make sense in this universe...
|
[17:47] private_meta
|
We have a cluster with several nodes/servers, and they all need to be communicated with while they do some multicore and multinode crunching
|
[17:47] private_meta
|
hmm
|
[17:47] private_meta
|
something familiar...
|
[17:47] pieterh
|
so a client behaves correctly when it doesn't do any work, and starts to lose messages when it uses MPI...?
|
[17:47] private_meta
|
well, imagine MPI to be some sort of threading library, where the threads can communicate with each other AND can be on different computers
|
[17:47] private_meta
|
>_>
|
[17:47] pieterh
|
sure, like a primitive 0MQ
|
[17:48] private_meta
|
ok
|
[17:48] pieterh
|
nah, I'm sure MPI is great, that's not the point
|
[17:48] pieterh
|
your emulation over 0MQ works until you link clients with MPI, right?
|
[17:48] private_meta
|
so, we use this setup to execute code on different platforms, like Graphics Cards (CUDA compiler), CPUs (MPI compiler) or IBM Cell Broadband Engine (IBM compiler)
|
[17:49] pieterh
|
ack, a fairly classic setup IMO
|
[17:49] private_meta
|
Whenever it's compiled with MPI and used with MPI, several messages are lost
|
[17:49] pieterh
|
ok
|
[17:49] pieterh
|
do you have a *minimal* test case that reproduces this?
|
[17:50] private_meta
|
I'd have to ask my colleague, but he's not here right now
|
[17:50] pieterh
|
faced with this, what I'd do is:
|
[17:50] private_meta
|
As far as I know, he created some minimal test case for trying to find the bug
|
[17:50] pieterh
|
... ok, more questions
|
[17:50] pieterh
|
what 0MQ socket types are you using?
|
[17:51] private_meta
|
slow down, slow down, I'm new to 0MQ, do you mean like "TCP and IPC" or do you mean that "ZMQ_REP" stuff?
|
[17:51] pieterh
|
:-)
|
[17:52] pieterh
|
both
|
[17:52] pieterh
|
socket types means REP/REQ/PUB/SUB/etc.
|
[17:52] pieterh
|
but I was also going to ask what transports you use (presumably tcp://)
|
[17:52] private_meta
|
We want to use TCP and IPC
|
[17:53] pieterh
|
the best way to proceed (and this is for any kind of 0MQ problem you face)
|
[17:53] pieterh
|
is to make a minimal 0MQ server/client that reproduces the problem
|
[17:53] private_meta
|
I guess I still need to find a list where the REP/REQ and other 0MQ vocabulary is detailed
|
[17:53] pieterh
|
and post this somewhere we can look at it (e.g. a gist at github)
|
[17:53] private_meta
|
uhm...
|
[17:54] private_meta
|
My problem isn't with 0MQ right now I hope
|
[17:54] pieterh
|
and you do need to read the Guide (http://zguide.zeromq.org/chapter:all)
|
[17:54] private_meta
|
It's getting to simulate what I HAVE (without the error of course)
|
[17:54] pieterh
|
90% of the time it's down to some error in how you use 0MQ
|
[17:54] private_meta
|
yeah, I'm doing that, I was initially coming here to ask about the socket termination issue
|
[17:55] private_meta
|
It's not like the docs are a 5 minute read :)
|
[17:55] pieterh
|
3-4 days, IMO
|
[17:55] pieterh
|
well, it took longer to write :-)
|
[17:56] pieterh
|
anyhow, first thing to do is a sanity check of your 0MQ code
|
[17:56] private_meta
|
I'm sure of that. I try to extract what's necessary, I doubt I need all the details for now (somewhat lazy approach, I know)
|
[17:56] pieterh
|
we don't care about code that works
|
[17:56] pieterh
|
so a minimal (totally stripped down) server/client that fails, that we can look at...
|
[17:56] pieterh
|
if we don't find any errors in that, we can start to blame something else
|
[17:57] private_meta
|
Well, the usual then :)
|
[17:57] pieterh
|
right
|
[17:57] pieterh
|
feel free to email me at ph@imatix.com if I'm not here when you're ready
|
[17:57] pieterh
|
or else post to the zeromq-dev list
|
[17:58] pieterh
|
I assume the problem can be reproduced without exotic hardware?
|
[18:00] private_meta
|
Somehow it feels like I'm being misunderstood. Until now there's no problem with 0MQ yet, just with Boost Asio, that's why I want to replace Asio with 0MQ, but I just started, so there are no problems, except for the learning curve :)
|
[18:04] pieterh
|
ah
|
[18:04] private_meta
|
>_<
|
[18:04] pieterh
|
see how fast we kill problems with 0MQ!
|
[18:04] pieterh
|
that took negative 30 minutes
|
[18:05] pieterh
|
of *course* boost asio is dropping messages
|
[18:05] private_meta
|
o_O
|
[18:05] private_meta
|
Well, it shouldn't
|
[18:05] pieterh
|
sorry for misunderstanding
|
[18:05] pieterh
|
presumably there is a queue overflow issue or something
|
[18:08] pieterh
|
So if you want to make a boost asio emulation layer over 0MQ, I'd recommend doing it open source
|
[18:11] private_meta
|
The amount of "emulation" we need contradicts a full open source emulation... it would be a heck of a lot of work, and I don't have time for that at work >_>
|
[18:11] private_meta
|
not that it wouldn't be a neat idea
|
[18:14] pieterh
|
The usual (sane) approach is to make strictly only what you need for your apps, release it, and allow others to expand it
|
[18:15] pieterh
|
Assuming it's possible to map the subset of boost asio you use to 0MQ
|
[18:15] private_meta
|
It would be difficult I assume
|
[18:42] Guthur
|
sustrik, returning to our discussion earlier re: IOCP, would it not be beneficial in the long run, if possible, to have both named pipes and sockets on IOCP, with the aim of removing the need for Select
|
[18:53] staylor
|
I have a question about zmq sockets, are the underlying sockets maintained or opened/closed on demand?
|
[18:53] staylor
|
reason I ask is I'd like to know from my application if the client application is currently connected to the server or not, but I don't see any socket status calls in the zmq_socket api
|
[19:04] cremes
|
pieterh: does anyone with a wiki account have permission to modify the FAQ?
|
[19:04] cremes
|
pieterh: nm; just answered my own question
|
[19:13] cremes
|
just updated the FAQ to help people with the assertion in mailbox.cpp:182
|
[20:40] sejo
|
hey all, I'm looking at different solutions, and basically just need an mq that I can use from python,
|
[20:40] sejo
|
so what would be my advantage using 0mq over rabbitmq or others?
|
[20:47] Guthur
|
sejo, I really think it depends on use case scenario
|
[20:48] Guthur
|
they are different beasts, rabbitmq is a broker-based MQ, whereas ZeroMQ is brokerless, for a start
|
[20:48] Guthur
|
I won't pretend to know much about rabbitmq though
|
[20:49] Guthur
|
hehe, even my 0MQ knowledge would be on the lighter side compared to some around here
|
[20:50] sejo
|
ok, well basically in the beginning i probably have only like 10 clients popping items, and the same 10 pushing others onto it
|
[20:51] sejo
|
as far as I understand now I should write my own protocol(s) and can use them over the clients and the servers. However is it easy to have multiple servers handling the same data?
|
[20:52] sejo
|
it probably is
|
[20:52] sejo
|
sorry stupid question
|
[20:56] sejo
|
my biggest fear is that i'll spend too much time developing on it before I can use it...
|
[20:56] sejo
|
that's why I ask around and not test them out all.. don't have the time for it
|
[20:58] Guthur
|
sure, it's sensible to do research first
|
[20:59] Guthur
|
scaling to multiple servers would be something that 0MQ can do well
|
[21:01] Guthur
|
but you don't really get much in the way of 'topic' or 'queue' management out of the box, though there are PUB/SUB sockets
|
[21:01] Guthur
|
I'm reluctant to give any hard advice though, due to my lack of hardcore experience and knowledge
|
[21:02] sejo
|
thanks anyway right now I have no knowledge on what to use so
|
[21:02] Guthur
|
you could glance through the 0MQ guide
|
[21:02] sejo
|
basically i want multiple servers and n-clients pushing and popping independently
|
[21:03] Guthur
|
http://zguide.zeromq.org/chapter:all
|
[21:03] sejo
|
i'm reading through it while we talk :p
|
[21:03] Guthur
|
ok
|
[21:03] Guthur
|
hehe cool
|
[21:03] Guthur
|
there are a few examples in there that could give some inspiration for your particular problem
|
[21:04] sejo
|
yeah, main thing is that I don't need a real pub/sub, client just chooses when to pop a message
|
[21:06] Guthur
|
check out the Queue device, which would show a possible multi server pattern
|
[21:06] Guthur
|
at a very simple level
|
[21:06] Guthur
|
pieter or sustrik would be better at giving advice than me
|
[21:07] sejo
|
thx, i'll check it out
|
[21:07] sejo
|
the thing that got me here was the nice looking python api :p
|
[21:13] Guthur
|
I'm not familiar with the python binding, but yeah I'm sure it's nice, hehe
|
[21:14] Guthur
|
python has that sort of philosophy, nice simple interfaces
|
[21:15] sejo
|
well, i'll read up on it more, the ventilator example pretty much does what i want, only i have multiple ventilators and each of them has multiple types of messages
|
[21:16] sejo
|
well no probably i only need one type that works with json
|
[21:16] Guthur
|
I like JSON, nice format
|
[21:17] sejo
|
Guthur: thanks for the information, i'll read up on it a bit more and then i'll probably need to choose
|
[21:17] sejo
|
ttyal
|
[21:17] sejo
|
gtg
|
[21:17] Guthur
|
later
|
[21:17] Guthur
|
ok, drop by later and someone more experienced can give better advice
|
[21:43] lt_schmidt_jr
|
is gonzalo here perchance
|
[21:46] whack
|
So, is there no way to bind to a random port? (like binding to port 0)
|
[21:47] whack
|
I'm not seeing anything obvious in the docs, and attempts to bind to tcp://blah:0 result in an error
|
[22:46] sustrik
|
lt_schmidt_jr: gonzalo doesn't come here often, you have to use email instead
|
[22:47] sustrik
|
whack: no there's no way
|
[22:53] lt_schmidt_jr
|
sustrik; we are having an impedance mismatch on our responses, thanks
|
[23:06] kdj
|
So what is the proper way to make sure that a message is sent to a polling server? Just a response?
|
[23:07] cremes
|
kdj: i don't understand the question; can you rephrase?
|
[23:09] kdj
|
Sorry. We have some clients that will occasionally send a short message to a server... but just sending won't error if the server isn't there. I understand why (I think)
|
[23:10] kdj
|
But I want to make sure the server is there
|
[23:10] cremes
|
kdj: that's correct; 0mq has no indicator that the server went away
|
[23:11] cremes
|
you should establish an "ack" that the server should send back; if it times out, the server is dead
|
[23:11] cremes
|
i recommend polling on req/rep sockets to accomplish this
|
[23:11] cremes
|
e.g. each client has its own REQ socket; the server has an XREP socket (so that it can respond to multiple clients)
|
[23:11] kdj
|
You can poll on REQ sockets?
|
[23:12] kdj
|
Yeah, that is how it is setup now
|
[23:12] cremes
|
absolutely; send/recv with ZMQ_NOBLOCK
|
[23:12] cremes
|
and register them with zmq_poll
|
[23:13] kdj
|
Sending with NOBLOCK isn't actually doing anything with just a normal REQ socket, but it does on receive... is that because I need polling?
|
[23:14] cremes
|
kdj: well, you don't *need* to send with noblock
|
[23:14] cremes
|
the basic idea is when your client sends the data, start a timer
|
[23:15] cremes
|
if the server responds back, cancel the timer
|
[23:15] cremes
|
if the timer expires, close the req socket
|
[23:15] cremes
|
none of that needs noblock
|
[23:15] lt_schmidt_jr
|
to jump in with kdj, when would you use polling vs blocking
|
[23:15] cremes
|
you will need to poll if your timer and req socket are in the same thread
|
[23:16] cremes
|
lt_schmidt_jr: like so...
|
[23:16] cremes
|
if you start your timer and then call recv in blocking mode, how do you handle timer expiration?
|
[23:16] cremes
|
1. timer must live in a separate thread or process from the blocking recv
|
[23:17] lt_schmidt_jr
|
right
|
[23:17] cremes
|
2. recv is non-blocking and you use poll to handle the recv; timer is on the same thread
|
[23:17] cremes
|
those are the 2 ways i would approach
|
[23:17] cremes
|
i like #2 better
|
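Approach #2 (send, then poll with a timeout on the same thread) can be sketched as follows. This is only an illustration under stated assumptions: a plain stdlib socket and `select.select` stand in for a REQ socket and `zmq_poll`, and `request_with_timeout` is a hypothetical helper name.

```python
import select
import socket


def request_with_timeout(sock, payload, timeout):
    """Send payload, then wait up to `timeout` seconds for the server's ack.

    Returns the reply bytes, or None if the timer expired first.
    """
    sock.sendall(payload)
    # The poll timeout is the timer: no second thread is needed.
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # timer expired; the caller treats the server as dead
    return sock.recv(4096)


if __name__ == "__main__":
    client, server = socket.socketpair()
    print(request_with_timeout(client, b"ping", 0.1))  # server is silent here
    client.close()
    server.close()
```

With a real REQ socket the same shape applies: poll with a timeout instead of calling a blocking recv, and on expiry close the REQ socket as cremes describes, since it is still waiting for a reply.
|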
[23:17] kdj
|
Ok. I wasn't sending an acknowledgement from the server originally... just receiving the message and moving on
|
[23:17] cremes
|
threading gets so messy
|
[23:18] cremes
|
kdj: if you were using REQ sockets on the client, the next time you tried to send you would get an EFSM error
|
[23:18] cremes
|
REQ/REP sockets are strictly stateful; REQ *must* send/recv/send/recv while REP *must* recv/send/recv/send
|
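The strict alternation can be pictured with a toy model. This is not a real socket and not the 0MQ API: `ReqStateModel` and `StateError` are hypothetical names, and the class only tracks call order to show when an EFSM-style error would occur on the REQ side.

```python
class StateError(Exception):
    """Stands in for the EFSM error a real REQ socket would report."""


class ReqStateModel:
    """Toy model of a REQ socket: send, recv, send, recv, and nothing else."""

    def __init__(self):
        self.awaiting_reply = False

    def send(self, message):
        if self.awaiting_reply:
            raise StateError("must recv the reply before sending again")
        self.awaiting_reply = True

    def recv(self):
        if not self.awaiting_reply:
            raise StateError("must send a request before receiving")
        self.awaiting_reply = False
```

A client that sends twice without receiving in between hits the error on the second send, which is the situation kdj would have run into when the server never replied.
|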
[23:19] kdj
|
Yeah, that makes sense.
|
[23:21] lt_schmidt_jr
|
hmm, interesting, so I should be able to put multiple sockets with a poller
|
[23:21] lt_schmidt_jr
|
same poller
|
[23:23] kdj
|
Hmmm... does 0mq send an acknowledgement automatically?
|
[23:23] cremes
|
lt_schmidt_jr: yes
|
[23:23] cremes
|
kdj: no
|
[23:24] cremes
|
kdj: the heartbeat is an application-level responsibility; your code must process and send the ack
|
[23:24] cremes
|
you could actually abstract this out into your own private "heartbeat" socket and make it completely transparent
|
[23:25] kdj
|
Yeah, that totally makes sense... I just threw some code together to test it though and it (sort of) works
|
[23:26] kdj
|
having a poller on the server end, which just receives messages (no sending), and a client which sends and then receives... somehow the receiving on the client end is still happening (and not blocking)
|
[23:26] lt_schmidt_jr
|
kdj: for me I am planning to use ZooKeeper, which I have used successfully in a similar way to figure out server presence
|
[23:27] lt_schmidt_jr
|
in my case to figure out other servers that will form a cluster
|
[23:27] cremes
|
kdj: print out the data that your client is receiving
|
[23:27] cremes
|
or run tcpdump and watch the packets fly
|
[23:27] cremes
|
unless you are issuing a zmq_send() from the server, the client shouldn't be getting a response
|
[23:27] cremes
|
there has to be code doing that somewhere in your example
|
[23:28] cremes
|
is it small enough to pastie?
|
[23:29] lt_schmidt_jr
|
kdj, cremes: you can use http://pastebin.com/
|
[23:31] kdj
|
Sorry, I think it was just my threading code for testing it. It works as it is supposed to. :X
|
[23:32] cremes
|
yeah, that's an easy mistake to make
|
[23:32] cremes
|
take a look at using the "inproc" transport for communicating between threads
|
[23:32] cremes
|
it obviates the need for mutexes and makes threading code simpler
|
[23:32] cremes
|
btw, that's one of the great wins of using 0mq; it's a threading library too!
|
[23:34] lt_schmidt_jr
|
cremes: not to ask a stupid question, but how does one use it for threading - is it in the guide?
|
[23:36] cremes
|
lt_schmidt_jr: i don't know if it's in the guide; haven't looked lately
|
[23:36] cremes
|
but here's the basic idea
|
[23:36] cremes
|
imagine you have 10 threads trying to access a shared resource
|
[23:36] lt_schmidt_jr
|
right
|
[23:36] cremes
|
right now you use a mutex, spinlock or some locking structure
|
[23:36] lt_schmidt_jr
|
ok
|
[23:37] cremes
|
with 0mq, put the resource that everyone wants into its own thread and give it an XREP socket
|
[23:37] cremes
|
now make every other thread a "client" of that "server" and give them REQ sockets
|
[23:37] cremes
|
connect them all together using inproc (all platforms) or ipc (unix only) to communicate so you don't pay the TCP penalty
|
[23:38] cremes
|
each client "asks" the resource for whatever via the 0mq socket
|
[23:38] cremes
|
the 0mq socket serializes all access to the resource and prevents all race conditions
|
[23:38] cremes
|
make sense?
|
[23:38] lt_schmidt_jr
|
I see
|
[23:38] lt_schmidt_jr
|
absolutely
|
[23:38] lt_schmidt_jr
|
thank you
|
[23:38] cremes
|
this is the basic idea behind Actors if you have played with those in any languages
|
[23:39] lt_schmidt_jr
|
I have played with erl
|
[23:39] cremes
|
lt_schmidt_jr: right; instead of using mutexes, you are using *messaging* for your concurrency
|
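The shape of this pattern can be sketched without any 0MQ calls. In the sketch below a stdlib `queue.Queue` stands in for the inproc socket pair (an assumption made so the example runs anywhere), a counter stands in for the shared resource, and `counter_owner` and `run` are hypothetical names. The application code takes no locks of its own; only the owner thread ever touches the counter.

```python
import queue
import threading


def counter_owner(requests):
    """Sole owner of the shared counter; serves one request at a time."""
    count = 0
    while True:
        reply_box = requests.get()
        if reply_box is None:       # shutdown signal
            break
        count += 1                  # only this thread ever touches count
        reply_box.put(count)


def run(num_clients=10, per_client=100):
    requests = queue.Queue()
    owner = threading.Thread(target=counter_owner, args=(requests,))
    owner.start()
    seen = [[] for _ in range(num_clients)]

    def client(out):
        reply_box = queue.Queue(maxsize=1)   # this client's private reply channel
        for _ in range(per_client):
            requests.put(reply_box)          # "ask" the resource
            out.append(reply_box.get())      # wait for its answer

    workers = [threading.Thread(target=client, args=(seen[i],))
               for i in range(num_clients)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    requests.put(None)
    owner.join()
    return sorted(value for out in seen for value in out)
```

Every client gets a distinct counter value because requests are handled strictly one after another by the owning thread, which is the serialization cremes describes.
|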
[23:39] cremes
|
and here's another cool part of using 0mq
|
[23:39] lt_schmidt_jr
|
cremes: very cool
|
[23:40] cremes
|
let's say at some point this "server" resource needs to be on its own box
|
[23:40] cremes
|
all you have to do to change communications is modify the transport string that you pass to zmq_connect/zmq_bind from inproc (or ipc) to tcp
|
[23:40] cremes
|
instant scaling
|
[23:40] cremes
|
i have used this technique many times already; works wonderfully
|
[23:40] lt_schmidt_jr
|
yeah, you would just change the ..
|
[23:41] lt_schmidt_jr
|
I have prototyped a pub/sub message bus and I have inproc/ipc/tcp going between different participants
|
[23:42] kdj
|
Hmmm... now I'm not really sure how our original client/server stuff was working...
|
[23:42] lt_schmidt_jr
|
but I think I am just not treating the threading correctly - too many threads
|
[23:43] cremes
|
lt_schmidt_jr: you'll have to figure that one out; i'm not a threading expert
|
[23:45] lt_schmidt_jr
|
cremes: the issue is I have a thread per connection and I still need to use polling to figure out if the thread needs to be shut down
|
[23:45] lt_schmidt_jr
|
so it's a little ugly
|
[23:46] cremes
|
i don't understand, but ok
|
[23:46] lt_schmidt_jr
|
If I block on recv, I am not sure how a subscriber can be interrupted
|
[23:46] cremes
|
oh, i see
|
[23:47] cremes
|
are you using 2.0.10 or 2.1.0?
|
[23:47] lt_schmidt_jr
|
2.0.1 and Java
|
[23:47] lt_schmidt_jr
|
2.0.10
|
[23:47] cremes
|
um... ok
|
[23:47] lt_schmidt_jr
|
is there something in 2.1.0 that I should be using?
|
[23:47] cremes
|
i think your only solution then is to close the entire context via zmq_term()
|
[23:48] cremes
|
that will cause each socket to awaken and return ETERM
|
[23:48] cremes
|
everybody should be on 2.1.0 now; the only 2.0.10 users should be legacy guys who *cannot* upgrade for whatever reason
|
[23:48] cremes
|
so yeah, upgrade
|
[23:48] lt_schmidt_jr
|
see, I have multiple subscribers within the same context, and only one would need to be terminated
|
[23:49] cremes
|
yep, terminating the context terminates *all* sockets so that's your only choice there
|
[23:49] cremes
|
in 2.1.0 i believe you can call zmq_close() on the socket from another thread and it will work as expected
|
[23:49] lt_schmidt_jr
|
ok, I skipped 2.1.0, because it caused the java binding unit tests to fail
|
[23:49] cremes
|
yeah, 2.1.0 is considered beta so not everyone has updated their bindings
|
[23:50] lt_schmidt_jr
|
maybe I should do that myself
|
[23:50] cremes
|
but it is *way* more stable than 2.0.10 so i would upgrade
|
[23:50] cremes
|
maybe you could submit a patch to fix the java tests
|
[23:50] lt_schmidt_jr
|
I submitted the maven fix, should do this as well
|
[23:51] lt_schmidt_jr
|
so I could close the socket from a different thread, great
|
[23:52] lt_schmidt_jr
|
I guess I could figure out how to use polling correctly and not have a bunch of threads in the first place
|
[23:52] lt_schmidt_jr
|
that is have many sockets and a single polling thread
|
[23:53] lt_schmidt_jr
|
and not have the computer turn into a space heater
|
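The "many sockets, one polling thread" layout can be sketched like this. It is an illustration under assumptions, not 0MQ code: the stdlib `selectors` module stands in for `zmq_poll`, socket pairs stand in for subscriber sockets, and `poll_once` and `demo` are hypothetical names.

```python
import selectors
import socket


def poll_once(sel, timeout):
    """Return {name: data} for every registered socket readable within timeout."""
    ready = {}
    for key, _events in sel.select(timeout):
        ready[key.data] = key.fileobj.recv(4096)
    return ready


def demo():
    sel = selectors.DefaultSelector()
    pairs = {name: socket.socketpair() for name in ("a", "b", "c")}
    for name, (reader, _writer) in pairs.items():
        sel.register(reader, selectors.EVENT_READ, data=name)
    pairs["a"][1].sendall(b"one")
    pairs["c"][1].sendall(b"three")
    first = poll_once(sel, 1.0)     # "a" and "c" have data, "b" is idle
    second = poll_once(sel, 0.05)   # nothing pending: returns after the timeout
    sel.close()
    for reader, writer in pairs.values():
        reader.close()
        writer.close()
    return first, second
```

Because each poll returns after at most `timeout` seconds, a loop built on it can check a shutdown flag between polls, so no thread sits in a blocking recv that cannot be interrupted.
|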
[23:53] lt_schmidt_jr
|
will go through the guide
|
[23:54] kdj
|
Thanks for your help cremes
|