Sunday November 28, 2010

[Time] Name Message
[01:24] shaun510 mikko: got a prototype running using XREP/XREQ. It was a little trickier than expected, mainly due to leaving out the empty frame indicating the end of the addressing frames
[01:24] shaun510 otherwise, fairly painless.
[01:26] mikko are you reading the zguide?
[01:26] mikko and good news
[01:26] mikko thats quick movement
[01:29] shaun510 yeah, the zguide is pretty awesome.
[01:30] shaun510 anyway, I'm not entirely sure if the XREQ/XREP pair is the path I want to take on this. It seems like 1:1 bidirectional streaming might be a pattern common enough to formalize.
[01:32] mikko well, PAIR is 1:1
[01:33] shaun510 yeah, I saw that, but it seems like those are missing some of the nicer bits (I think the zguide mentioned something about no auto reconnection, for instance)
[01:37] shaun510 I might just need to rethink how these components are talking to each other. 1:1 streaming makes sense if you're limited to TCP connections, but maybe it would be better to use one of the other patterns.
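
The empty frame shaun510 mentions is what separates the addressing frames from the message body. Below is a minimal sketch of that framing in plain Python, with no 0MQ calls. The helper names `wrap_envelope` and `split_envelope` and the peer identity `b"peer-1"` are hypothetical and used only for illustration.

```python
def wrap_envelope(addresses, body):
    """Build a multipart message: address frames, one empty frame, body frames."""
    return list(addresses) + [b""] + list(body)


def split_envelope(frames):
    """Split a multipart message into (address_frames, body_frames).

    The first empty frame is treated as the delimiter that ends the
    addressing frames; it appears in neither result.
    """
    try:
        delimiter = frames.index(b"")
    except ValueError:
        # Without the delimiter there is no way to tell addresses from body.
        raise ValueError("no empty delimiter frame in message") from None
    return frames[:delimiter], frames[delimiter + 1:]


message = wrap_envelope([b"peer-1"], [b"hello"])
addresses, body = split_envelope(message)
```

Leaving out the `b""` frame makes `split_envelope` fail, which mirrors the kind of mistake described above.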
[10:56] Steve-o looks like I'd better cut a new PGM 5.0 if you want a master release on monday
[10:57] Steve-o backport the rate limiter fix from 5.1
[14:05] Gekz hi guys.
[14:06] Gekz I'm using pyzmq, and I'm getting an error when attempting the following
[14:06] Gekz socket.connect("epgm://")
[14:06] Gekz File "socket.pyx", line 342, in zmq.core.socket.Socket.connect (zmq/core/socket.c:3304)
[14:06] Gekz zmq.core.error.ZMQError: Invalid argument
[14:08] Gekz socket.bind("pgm://") on the other server works though
[14:08] Gekz the host
[14:08] Gekz the client is spitting the error.
[14:09] Gekz it makes no difference whether I use pgm or epgm.
[14:13] sustrik you are specifying a wrong connection string
[14:14] sustrik Gekz: check zmq_pgm(7)
[14:14] Gekz I have tried specifying an interface, such as epgm://eth0;
[14:14] Gekz it gives me the same error >_>
[14:14] sustrik what's
[14:14] sustrik there should be a multicast group there
[14:15] Gekz a hostname.
[14:15] sustrik you probably want to use TCP, not PGM
[14:15] Gekz I'm writing a server/client to distribute 70Gb from one server to approximately 100 computers.
[14:16] Gekz I'm thinking TCP wont suffice for that.
[14:16] sustrik on LAN?
[14:16] Gekz on LAN.
[14:17] Gekz oh, I totally misunderstood how multicasting works.
[14:17] Gekz I didn't realise there was an IP range specifically for it.
[14:17] sustrik you should read some introductory article first
[14:18] sustrik
[14:18] Gekz haha, I did, I just _missed_ that integral part.
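
As sustrik points out, a PGM endpoint needs a multicast group, not just an interface or hostname. The sketch below builds a connection string in the `interface;group:port` form described in zmq_pgm(7) and rejects addresses that are not multicast. The helper name `pgm_endpoint`, the group `239.192.1.1` and the port `5555` are assumptions chosen for illustration; they do not come from the log.

```python
import ipaddress


def pgm_endpoint(interface, group, port, encapsulated=True):
    """Return a pgm:// or epgm:// endpoint string, checking the group address."""
    address = ipaddress.ip_address(group)
    if not address.is_multicast:
        # For IPv4 the multicast range is 224.0.0.0/4.
        raise ValueError(f"{group} is not a multicast address")
    if not 0 < port < 65536:
        raise ValueError(f"{port} is not a valid port")
    scheme = "epgm" if encapsulated else "pgm"
    return f"{scheme}://{interface};{group}:{port}"


endpoint = pgm_endpoint("eth0", "239.192.1.1", 5555)
```

The resulting string is what would be passed to `socket.connect()` or `socket.bind()` in place of the bare `"epgm://"` that produced the `Invalid argument` error.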
[14:21] Gekz sustrik__: before I make another stupid rookie error, as I said, I will be broadcasting large amounts of data. Will using ZMQ_SNDMORE suffice, or will I need to implement my own checks?
[14:22] sustrik you want to send only a single message?
[14:22] Gekz well, I need to send it in chunks, in order, to write a file in order.
[14:23] sustrik do so then
[14:23] sustrik PUB/SUB paradigm
[14:23] Gekz what I'm asking is whether or not using ZMQ_SNDMORE will affect that
[14:23] Gekz yeah, I'm using PUB/SUB
[14:24] sustrik SNDMORE sends only a single message, the only difference is you can compose it in chunks
[14:24] sustrik thus the whole message must fit into memory etc.
[14:24] Gekz oh ok
[14:25] Gekz so the other side receives one message
[14:25] Gekz but you're concatenating multiple into one on the host
[14:25] Gekz I understand now.
[14:25] sustrik right
[14:25] Gekz do I need to do any checks to ensure the packets are received in order?
[14:25] sustrik order should be ok
[14:25] Gekz that was my basic worry.
[14:25] sustrik what you need to check is whether packets are missing
[14:26] Gekz ah.
[14:26] sustrik when network is overloaded, some of them may be dropped
[14:26] sustrik but you can set the reliability parameters for PGM
[14:26] sustrik check ZMQ_RATE, ZMQ_RECOVERY_IVL socket options
[14:27] Gekz if they're dropped, does it NAK and request a resend?
[14:27] sustrik that's happening behind the scenes automatically
[14:27] sustrik when you see packet missing
[14:27] Gekz yeah, that's what I'm asking
[14:27] sustrik what happened was that the congestion was that bad there was no way to repair
[14:28] Gekz so I should do non-blocking recv and check for errors?
[14:28] sustrik no, just number the messages and check for missing messages then
[14:28] sustrik do not ask for resend on the application level
[14:28] Gekz ok
[14:29] sustrik if message is missing the state of network is so bad that trying to recover would actually make the congestion even worse
[14:29] Gekz so I need to know how it recovers. let's say it sends 1234, and the receiver receives 124.
[14:29] sustrik ?
[14:29] Gekz I'd have implemented something that noticed that 4 is in fact not after 3
[14:30] sustrik yes
[14:30] Gekz would the third packet be resent?
[14:30] Gekz or am I simply aware of the failure
[14:30] Gekz and need to start again?
[14:30] sustrik behind the scenes
[14:30] sustrik you won't see that
[14:30] Gekz ok
[14:30] Gekz so it just drops 4, and waits to see 3?
[14:30] sustrik if packet is missing
[14:30] sustrik something went terribly wrong
[14:30] sustrik you alert the admin then
[14:30] sustrik blow the siren
[14:31] sustrik start the red alert mode
[14:31] sustrik auto-destruct
[14:31] sustrik whatever
[14:31] Gekz lol
[14:31] Gekz so basically, that should never happen
[14:31] Gekz I should catch the error, abort, and work out the problem before attempting again?
[14:31] sustrik it can, but there's no way to recover from it
[14:32] sustrik for example, if the consumer is too slow
[14:32] sustrik it cannot keep up with the publisher
[14:32] sustrik and finally blows up
[14:32] sustrik the problem is there's no way to solve the problem
[14:32] Gekz yep
[14:33] sustrik maybe buy better box or something
[14:33] Gekz so basically, dont publish at 1Gbit if you have 100Mbit subscribers haha
[14:33] sustrik exactly
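
The advice above (number the messages, check for gaps, and do not ask for a resend at the application level) could be sketched roughly as follows. This is plain Python with no 0MQ calls. The 8-byte sequence header, `number_chunks` and `GapDetector` are hypothetical choices for illustration, not something the log prescribes.

```python
import struct

# Assumed wire format: an 8-byte big-endian sequence number, then the chunk.
HEADER = struct.Struct("!Q")


def number_chunks(data, chunk_size):
    """Yield messages made of a sequence number followed by a slice of data."""
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        yield HEADER.pack(seq) + data[offset:offset + chunk_size]


class GapDetector:
    """Receiver side: accept messages in order and fail loudly on a gap."""

    def __init__(self):
        self.expected = 0

    def accept(self, message):
        (seq,) = HEADER.unpack_from(message)
        if seq != self.expected:
            # No resend request here: the transfer is aborted and reported.
            raise RuntimeError(
                f"expected message {self.expected}, got {seq}: data was lost"
            )
        self.expected += 1
        return message[HEADER.size:]


messages = list(number_chunks(b"abcdefghij", 4))
detector = GapDetector()
received = b"".join(detector.accept(m) for m in messages)
```

In Gekz's example of sending 1234 and receiving 124, `accept` would raise when message 4 arrives in place of 3, and the receiver would abort the transfer.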
[14:33] Gekz I'm actually trying to rig up a small application to distribute hdd images for a small job I have.
[14:34] sustrik try with TCP first
[14:34] Gekz I'm currently using novell zenworks, but that is just not feasible right now.
[14:34] sustrik multicast is a complex topic
[14:34] Gekz multicast is indeed more complex than I had expected.
[14:34] Gekz but I was told that using TCP was a no-no for this.
[14:34] sustrik you haven't heard all yet :)
[14:34] sustrik you need adequate network hardware to deal with it
[14:35] sustrik otherwise you can kill your network with it
[14:35] Gekz many procurve switches >_>
[14:35] Gekz many 1GBit switches
[14:35] sustrik also, the network has to be set up correctly
[14:35] sustrik etc.
[14:35] Gekz but yes, I'm not in control of the infrastructure (which based on my experience, is probably a good thing :P)
[14:35] Gekz but sadly, the guy in charge seems to know less
[14:35] Gekz which is a scarier though
[14:35] Gekz thought*
[14:35] sustrik anyway, start with TCP
[14:36] sustrik use multicast only if you are 100% sure the TCP won't fly
[14:36] Gekz let me pastebin the code I have for my host, I'm not entirely sure if it's right
[14:36] Gekz because when I test it with the client, it gets approximately half the packets before it just stops receiving.
[14:39] Gekz
[14:39] Gekz sustrik__: ^
[14:40] Gekz erm, except change epgm to tcp, I forgot to change it back xD
[14:42] sustrik it's ok
[14:43] sustrik keep in mind though that the subscribers may miss the beginning of the transmission
[14:43] sustrik if they are started after it has begun
[14:44] Gekz I start them first, this is just a test app
[14:45] sustrik but it takes some time to connect
[14:45] sustrik so you start the publisher
[14:45] sustrik it starts publishing
[14:46] sustrik the subscribers connect after 1/100 sec
[14:46] sustrik but they've missed some messages already
[14:46] Gekz no no
[14:46] Gekz I bind on the server, then I start the client, then I hit enter on the server to start sending
[14:46] sustrik then it should work fine
[14:46] Gekz but it doesnt :<
[14:47] sustrik then report a bug
[14:47] Gekz I'm not sure where the bug is
[14:47] Gekz because it stops printing numbers part way
[14:47] Gekz 280 281 282 28
[14:47] Gekz it never prints the 3 in 283 which confuses me.
[14:47] sustrik heh
[14:47] sustrik a deadlock in python?
[14:48] sustrik report the bug to pyzmq project IMO
[14:50] Gekz I changed how I test it
[14:50] Gekz and it worked.
[14:50] Gekz I made it send a count, instead of counting itself.
[14:51] Gekz ie, filled the buffer with 1 to 9999
[14:51] Gekz it worked.
[14:51] Gekz so it's probably just a crappy python bug
[14:52] Gekz haha that crashed my terminal
[14:52] Gekz damn you python, you're ruining my day
[14:53] Gekz haha, it's still sending packets
[14:59] Gekz anyway, thanks very much sustrik__
[14:59] Gekz you've guided my efforts well :P
[14:59] Gekz and now i shall sleep.
[16:09] mikko sustrik__:
[16:52] sustrik mikko: yes, seen it
[16:52] sustrik sorry
[18:31] magicblaze007 will there be a zeromq python installation system for windows (.msi/.exe) sometime?
[18:33] magicblaze007 seems like there used to be an msi installer -- (but the link doesnt have anything)
[18:43] Guthur magicblaze007, Probably a matter of no one being willing to maintain it
[18:43] Guthur you could ask on the mailing list and maybe volunteer
[18:44] sustrik Guthur: right
[18:44] sustrik it's just a matter of someone actually doing it :)
[18:46] Guthur mikko, The building of clrzmq2 seems to be stuck
[18:46] Guthur says that it has taken over 2.5 days so fatr
[18:46] Guthur far*
[18:47] Guthur
[18:50] mikko added a timeout
[18:50] mikko it now fails after 5 mins
[18:57] magicblaze007 Guthur: are these zeromq builds that are maintained regularly?
[18:58] magicblaze007 I guess i should be looking at this:
[18:58] mikko magicblaze007: those are daily builds for zeromq and bindings
[18:59] mikko magicblaze007: MSI installer is slightly different
[19:00] magicblaze007 all thats stopping me from using zeromq in my project is a windows msi installer. I wish i was any good in writing installers..
[19:00] Guthur magicblaze007, You could just build it...
[19:01] magicblaze007 Guthur: I tried. I guess I only have VS 2010. Didnt work
[19:01] mikko magicblaze007: why do you need msi installer?
[19:01] magicblaze007 because my users can easily install it?
[19:01] Guthur magicblaze007, build instructions are here [
[19:01] Guthur oops
[19:01] mikko why dont you ship it with your project?
[19:01] Guthur here
[19:02] magicblaze007 Guthur: for pyzmq installation, zmq should be installed...which is the problem
[19:03] Guthur It seemed to mention building libzmq from scratch
[19:03] mikko magicblaze007: you can ship the .dlls with your project
[19:03] mikko i dont think anything prevents you from doing that
[19:04] mikko libzmq.dll should be enough afaik
[19:04] magicblaze007 so pyzmq just needs one dll file?
[19:10] sustrik i think so
[19:19] andrewvc cremes: you around at all?
[19:19] andrewvc or I guess actually someone else might, know, I've got a question about zmq fds
[19:20] andrewvc basically,
[19:20] andrewvc says that you'd monitor the file descriptor you get with ZMQ_FD, and figure out what to do based on ZMQ_EVENTS
[19:21] andrewvc but, if you're using the FD with say 'select', should readable and writable be determined by select
[19:21] andrewvc why do you need ZMQ_EVENTS ?
[20:03] cremes andrewvc: i'm around for a bit...
[20:03] cremes i don't have an answer for your question ^^
[20:03] cremes i spent a *lot* of time messing around with 0mq FDs yesterday in Ruby and had very little luck treating it like a real FD
[20:04] cremes btw, jruby returns EBADF for 0mq FDs
[20:06] mikko andrewvc: because a full message might not be there
[20:06] mikko andrewvc: 0mq handles "messages" and activity on the socket doesn't mean that a message is ready
[20:15] andrewvc hi
[20:15] andrewvc ah, gotcha
[20:16] andrewvc yeah, the FDs are problematic as real FDs
[20:16] andrewvc jruby doesn't like them instantiated as IOs
[20:16] andrewvc but EM doesn't mind being passed the raw FD
[20:16] cremes interesting
[20:17] andrewvc yeah, I got issues instantiated FDs as IOs as well
[20:17] andrewvc *instantiating
[20:17] andrewvc mikko: thanks btw
[20:17] cremes yeah, i'm trying to cook up another way of confirming the FD is good in the ruby env
[20:17] cremes no luck so far
[20:22] cremes andrewvc: that might be a bug
[20:22] andrewvc cremes: thanks. Can do
[20:23] cremes andrewvc: want to collaborate on zmqmachine too? alternately, once EM goes 1.0 we might be able to get 0mq sockets into it and then i can drop the zmqmachine project
[20:24] andrewvc sure, definitely, was working on adding EM support to it today
[20:24] andrewvc had rudimentary, support kinda working
[20:24] andrewvc brb, gtg for ~ 45 min
[20:24] cremes neat!
[20:24] cremes ttyl; i'm out the remainder of the day
[21:13] mgc Hmm, doing pub/sub with epgm, when a subscriber gets too far behind it gets permanently blacklisted. Is there any way to handle/recover from this other than just consuming faster and manually reconnecting?
[23:11] Guthur mgc: I assume you can set a higher ZMQ_RECOVERY_IVL
[23:12] Guthur that has memory issues though see ->
[23:19] mgc I've tried that, but no luck. I think I'm being bitten by
[23:19] mgc I only see the subscribers dropping off when they are all on the same box
[23:20] mgc when I load test between two boxes it doesnt seem to happen
[23:20] mgc so I tried delaying the subscriber connects, but no luck.
[23:24] magicblaze007 how is activemq different from zeromq?
[23:27] mgc magicblaze007:
[23:29] magicblaze007 thanks
[23:30] magicblaze007 where can i find the zeromq-dll for windows for download. Cant compile the source on windows
[23:35] mgc
[23:38] magicblaze007 thanks mgc
[23:38] magicblaze007 now does anyone know how i can use this dll file and use pyzeromq on a windows machine?
[23:40] mgc sorry :)
[23:42] mgc Guthur: ugh, nevermind, I get the dead subscriber then between two boxes as well. 4,000 msgs/s, hwm recoveryivl buffer sizes all cranked way up, after several minutes one or more subscribers just stops getting messages.
[23:42] Guthur mgc: Sorry I had tuned out there
[23:42] Guthur just reading the backlog