Tuesday January 18, 2011

[Time] Name Message
[00:30] traviscline i'm investigating getting the pyzmq polling integrated into gevent (cython wrapping of libevent) and wanted to ask if anyone had any 'to-reads' other than the eventlet implementation
[00:31] mikko i think there was someone here who was integrating with gevent
[00:32] mikko the process is very simple
[00:32] mikko i can walk you through it if you like?
[00:34] traviscline mikko: yeah if you have any input that'd be great
[00:35] mikko so, first you need zeromq 2.1.x
[00:35] traviscline rgr
[00:35] mikko what you need to do roughly is:
[00:35] mikko you create a zmq socket and connect/bind it
[00:35] mikko you call getsockopt ZMQ_FD on the socket to get a filehandle
[00:36] mikko then you add gevent io watcher on that filehandle
[00:36] mikko and your callback should be signaled when the socket is readable/writable
[00:36] traviscline that list posting discouraged me from taking that path
[00:36] traviscline but thanks, I'll give that a go
[00:37] mikko zeromq doesnt support io completion ports
[00:37] mikko that is true
[00:38] traviscline mikko: shit sorry, meant this
[00:39] mikko he is probably missing a fact that zeromq is edge-triggered
[00:39] mikko so when gevent signals that socket is readable you need to read until you get EAGAIN
[00:40] mikko so do nonblocking recv in a loop (in the callback) and break out from loop when you get EAGAIN from recv
[00:40] mikko edge-triggered means that the callback gets called on state changes, not necessarily on every new message
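The recipe mikko describes can be sketched as follows with pyzmq, using plain select() where gevent would register an io watcher (a rough illustrative sketch, not gevent-specific code; the PAIR sockets and the `drain` helper are made up for the demo):

```python
# Sketch of the ZMQ_FD + drain-until-EAGAIN pattern described above,
# using select() in place of a gevent io watcher (assumes pyzmq).
import select
import zmq

ctx = zmq.Context()
a = ctx.socket(zmq.PAIR)
b = ctx.socket(zmq.PAIR)
a.bind("inproc://fd-demo")
b.connect("inproc://fd-demo")

for i in range(3):
    a.send_string("msg-%d" % i)

# getsockopt(ZMQ_FD) yields an OS-level descriptor any event loop can
# watch. It is edge-triggered: it signals state changes, not every message.
fd = b.getsockopt(zmq.FD)
select.select([fd], [], [], 1.0)  # wait for the edge notification

def drain(sock):
    """Read until EAGAIN -- mandatory with an edge-triggered FD."""
    messages = []
    while True:
        try:
            messages.append(sock.recv(zmq.NOBLOCK))
        except zmq.Again:  # EAGAIN: the 0mq queue is drained
            break
    return messages

received = drain(b)

a.close()
b.close()
ctx.term()
```

A gevent integration would register `fd` with a read watcher and call something like `drain()` from the callback, exactly because the FD will not fire again for messages already queued.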
[00:43] traviscline *nods*
[00:45] traviscline mikko: !! thanks
[00:45] traviscline works
[00:48] mikko cool
[00:51] traviscline mikko: straight port as of now of eventlet's stuff
[00:51] traviscline but works
[00:51] traviscline on initial testing
[11:02] mikko sustrik_: im having slightly odd behavior with HWM
[11:03] mikko i am trying to word it
[11:34] Steve-o mikko: made advances on PGM on Windows
[11:34] Steve-o I have CMake working again, and can even make a GUI installer pretty easily
[11:34] mikko Steve-o_: nice!
[11:34] Steve-o its like a few lines of CMake to build a NSIS installer package
[11:35] mikko i wonder how windows install is best handled
[11:35] mikko currently we got mingw and MSVC builds
[11:35] mikko with mingw you should be able to reuse the autoconf builds
[11:35] mikko but not sure about MSVC
[11:35] Steve-o shared libraries and exes wont build for me, maybe because of x86/x64 difference with the compiler, with CMake that is
[11:36] Steve-o well I've gone for autoconf for unix and cmake for windows, and scons for development
[11:36] Evet is zeromq right tool to build a production http server?
[11:36] Steve-o Evet: have a look at Mongrel
[11:38] Steve-o mikko: I have cygwin on Windows but not tried mingw32, I usually cross compile with that instead
[11:38] Steve-o due to Python in Cygwin crashing on fork constantly
[11:38] Evet Steve-o_: thanks. it looks promising. but, do you have a suggestion for c?
[11:39] mikko mingw32 wont build
[11:39] mikko currently it's missing group_source_req in ws2tcpip.h
[11:39] mikko i debugged this a bit over the weekend
[11:39] Steve-o I have a lot of patches
[11:39] Steve-o two sets, one for mingw32 and one for mingw-w64
[11:39] mikko it seems that mingw64 includes these upstream (based on #mingw)
[11:39] mikko but it has not been synced to mingw32
[11:40] Steve-o not all of them though, at last check
[11:40] mikko which sounds slightly strange
[11:40] mikko Evet: what would you use zeromq for in http server?
[11:40] mikko Steve-o_: yeah, i stumbled onto those over the weekend
[11:41] mikko if you google 'struct group_source_req mingw' they are about the only results
[11:41] Steve-o you also need to force the WSACMSG header thingy too
[11:42] Steve-o cmsghdr or _WSACMSGHDR or wsacmsghdr depending on the compiler
[11:44] Steve-o I think that's how the Wine developers have noted my project, I've seen it mentioned a few times in their bug tracker
[11:45] Steve-o Evet: are you after a bespoke zmq-http forwarder (router/gateway)?
[11:45] Steve-o Evet: its certainly fast enough, but it all depends what your requirements are beyond or aside from Mongrel
[11:46] Evet Steve-o_: i need to embed to my c application
[11:47] Steve-o mikko: the mingw-w64 team still haven't released a version yet though have they? that's why I picked one random version and stuck with it
[11:47] Evet currently using libevent's http module. but zeromq looks cleaner
[11:48] Steve-o ok
[11:48] Steve-o I've been looking at using libevent or libcurl to implement a basic integration of HTTP and 0MQ
[11:49] Steve-o but you also have Boost which has a lot of scalability features already in for HTTP multi-core usage
[11:49] Steve-o is this a high or low load HTTP server though?
[11:51] Evet Steve-o_: it needs to handle high loads
[11:51] Evet Boost's ASIO module looks great, but i dont really know about c++'s oop thing
[11:52] Steve-o otherwise integrating with Nginx might be more likely
[11:54] mikko Evet: you can't really communicate with clients over zeromq
[11:54] mikko Evet: i've created libevent http based webserver but it uses zeromq for inter-thread communication
[11:55] Evet hmm
[11:56] Steve-o I wrote my own HTTP admin interface for PGM using async-io, but its definitely not multi-core IO scalable
[11:56] Steve-o they are two very large different domains
[11:57] Steve-o for basic C you can also use libsoup
[11:58] Steve-o but application and HTTP server don't really go together, you should use a dedicated http-zmq gateway and a zmq-based application server for the core logic
[12:03] mikko Steve-o_: i guess in which case you could just use mongrel2
[12:04] Evet in fact, a request-reply tcp server is sufficient for me
[12:04] Steve-o correct, but the question comes back to what is high load
[12:04] Steve-o thousands of requests per second?
[12:05] Evet 8k requests/second per cpu core
[12:06] Steve-o then typical design you would have basic edge gateways managing the HTTP requests and core application servers processing ZMQ messages at high speed
[12:08] Evet i can implement RFC rules
[12:08] Evet i have written some core modules for nginx, but its overcomplicated for a single application
[12:11] mikko does it have to be http?
[12:11] Steve-o in comparison Wikipedia is up to 90k requests per second for hundreds of servers (300+)
[12:12] mikko they run tons of squid servers iirc
[12:13] Evet mikko: no, i can handle http parsing
[12:17] Steve-o is this with an additional load balancer in front?
[12:19] Steve-o the point being in regular HTTP traffic 8k/s is quite high for even one machine
[12:20] Evet a quad-core desktop pc can handle ~30k non-keepalive requests per second
[12:22] Steve-o it all depends what you are serving though
[12:22] Steve-o which is why it's rather difficult to help you out
[12:22] Evet dynamic content through embedded caching
[12:23] Evet i have reached 90k req/sec with keepalive with an in-memory hashtable library
[12:24] Steve-o ok, so basically a higher protocol memcached? closer to amazon s3?
[12:25] Evet not really
[12:25] Steve-o :D
[12:25] Evet an embedded database library without ACID overhead
[12:26] mikko in-memory hashtable that has ACID ?
[12:26] Evet ofcourse not :)
[12:26] mikko memcached really is nothing more than a distributed hashtable
[12:26] Steve-o amazon simpledb then?
[12:26] mikko distributed in the very loose definition of the term
[12:27] Evet im going to use zeromq for brokerless replication
[12:28] Evet tokyo cabinet as embedded database library, which also able to append in-memory hashtable to disk
[12:28] mikko you should look into kyoto cabinet as well
[12:29] Evet have been using nginx, but its overcomplicated
[12:29] mikko i'm testing kyoto cabinet in current project
[12:29] Evet mikko: really? im testing kyoto cabinet for months too. nice to meet another kyoto* user
[12:30] Steve-o didn't Oracle release their replication transport recently
[12:30] mikko i remember tokyo cabinet hash database is O(log n) for retrieval?
[12:30] Evet mikko: nope. o(1)
[12:31] mikko Evet: but with hash database you have collisions
[12:31] mikko i don't see how you handle collision in O(1)
[12:32] Evet mikko: im generating uuid. but, is it what you asked?
[12:32] mikko no
[12:33] Evet could you rephrase then, im not good at english
[12:33] mikko if you don't know all the keys beforehand it's impossible to create perfect hash function
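mikko's point can be illustrated with a toy separate-chaining hashtable (purely illustrative, stdlib Python; no relation to tokyo/kyoto cabinet internals): without a perfect hash, colliding keys share a bucket, so lookup is O(1) only on average, degrading to a short linear scan inside the bucket.

```python
# Minimal separate-chaining hashtable: colliding keys land in the same
# bucket and are resolved by a linear scan -- O(1) average, not worst case.
class ChainedTable:
    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:          # overwrite existing key
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)

t = ChainedTable(nbuckets=1)  # degenerate case: every key collides
t.put("a", 1)
t.put("b", 2)
first, second = t.get("a"), t.get("b")  # both retrievable despite colliding
```

Generating uuid keys (as Evet suggests) spreads keys uniformly but does not remove the need for a collision strategy like the one above.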
[12:35] mikko Steve-o_: have you had issues with zmq_poll ?
[12:36] mikko im seeing weird behavior that i have reached HWM but the socket has revents ZMQ_POLLOUT
[12:36] Steve-o haven't used it yet
[12:37] mikko and i seem to be losing messages somewhere
[12:37] mikko need to debug further to see whether it's actually my software causing this
[12:37] mikko anyway, lunch time. bbl
[12:39] Evet hmm
[12:39] Evet Steve-o_: so in sum: is zeromq suitable for writing an asynchronous request-response tcp server?
[13:47] zchrish I have a realtime server connected and am sending packets at least once per second. But my subscriber is receiving them only every few seconds. I assume this is due to the Nagle algorithm; could this be the case?
[13:47] sustrik zchrish: nagle is turned off
[13:48] sustrik 0mq should definitely not behave that way
[13:48] sustrik do you have a minimal test case?
[13:48] zchrish I see. I am sure it is somewhere else then. Thank you.
[13:49] sustrik np
[13:49] zchrish Actually I am converting my server code over to zeromq and probably didn't activate my heartbeat.
[14:29] ptrb there should be no problem dynamically connecting and disconnecting an active ZMQ_SUB socket to various ZMQ_PUB sockets, right?
[14:30] ptrb hmm, except there is no disconnect :)
[14:58] CIA-21 zeromq2: Martin Sustrik master * r56bdba5 / (8 files):
[14:58] CIA-21 zeromq2: Fix cppcheck warnings: Prefer prefix ++/-- operators for non-primitive types.
[14:58] CIA-21 zeromq2: Signed-off-by: Martin Sustrik <> -
[14:59] mikko sustrik_: i'm seeing something weird
[14:59] sustrik yes?
[14:59] mikko sustrik_: effectively what i am trying to do is a device that stores messages if HWM is reached
[14:59] mikko PULL/PUSH sockets over tcp
[14:59] sustrik right
[15:00] mikko my hwm on the PUSH socket is set to 5 for testing
[15:00] mikko i connect producer to pull socket
[15:00] mikko send 100 messages and i can see five being within the zeromq buffer and 95 go to the persistent storage
[15:00] mikko now i connect a consumer that consumes five messages
[15:01] mikko well, i bind a consumer and the device connects to it
[15:01] mikko the consumer process exits after consuming five messages
[15:01] mikko so my assumption was that 5 messages should now go to socket, it would hit hwm and return EAGAIN
[15:02] mikko but what i see:
[15:02] mikko the out_socket is constantly signaling ZMQ_POLLOUT
[15:02] mikko and it keeps accepting messages
[15:02] sustrik that's because the messages are stored in TCP buffers at that moment
[15:02] mikko until my persistent store is empty and i turn off polling on the out socket
[15:03] sustrik so 0mq's queue is empty
[15:03] mikko then, i connect the consumer again
[15:03] mikko which is blocked on recv and no messages coming
[15:08] mikko is that the expected behavior?
[15:22] sustrik i think so
[15:22] sustrik the messages are stored in TCP buffers
[15:22] sustrik thus 0MQ buffers are empty
[15:23] sustrik you can set the size of TCP buffers using ZMQ_SNDBUF/ZMQ_RCVBUF
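A minimal sketch of the two knobs being discussed, assuming pyzmq against a modern libzmq (0MQ 2.x had a single ZMQ_HWM option; newer versions split it into ZMQ_SNDHWM/ZMQ_RCVHWM):

```python
# Setting the 0mq queue limit and the kernel TCP buffer hint (pyzmq).
import zmq

ctx = zmq.Context()
push = ctx.socket(zmq.PUSH)

push.setsockopt(zmq.SNDHWM, 5)     # 0mq-level queue limit, in messages
push.setsockopt(zmq.SNDBUF, 4096)  # kernel TCP buffer *hint*, in bytes

# getsockopt returns the values 0mq stored; as noted above, the OS
# treats SO_SNDBUF only as a hint and may not honor it exactly.
hwm = push.getsockopt(zmq.SNDHWM)
sndbuf = push.getsockopt(zmq.SNDBUF)

push.close()
ctx.term()
```

This is why mikko's SNDBUF=10 experiment changed little: the effective in-flight capacity is the HWM plus whatever the kernel actually allocates for the TCP buffers, not the requested byte count.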
[15:25] mikko i'll give that a go
[15:38] mikko sustrik_: i set the SNDBUF to 10 on the PUSH socket and still seeing the same behavior
[15:38] mikko it could be just something silly im doing as well
[15:39] sustrik i think the OS is not guaranteed to limit the TCP buffer to the value you supply
[15:39] sustrik it's more of a hint
[15:39] mikko DataStore is the implementation of persistent storage
[15:39] mikko hmm
[15:39] sustrik 10-byte TCP buffer seems strange
[15:39] mikko ill test with larger messages in a min
[15:40] mikko hmm
[15:41] mikko with 10KB messages i still lose some messages
[15:41] mikko but not as many
[15:41] sustrik lose?
[15:41] mikko consume 5 messages and lose 10 - 20 messages in between
[15:41] mikko the consumer never receives them
[15:42] mikko i got sequence in each message
[15:42] sustrik that looks like a bug
[15:42] mikko when i consume the first five i get 0 - 4
[15:42] mikko next time i might get 23 onwards
[15:43] sustrik is there 1 connection involved?
[15:43] mikko yes
[15:43] sustrik or 2 of them?
[15:43] mikko 1
[15:43] sustrik hm
[15:43] mikko odd thing:
[15:43] sustrik do you restart either peer?
[15:43] mikko yes
[15:43] mikko the consumer is a script
[15:43] mikko it consumes 5 and exits
[15:44] sustrik then there's another connection created?
[15:45] mikko yes
[15:45] mikko consume 5 at a time
[15:45] sustrik i see
[15:45] sustrik the messages are presumably dispatched to the old connection
[15:45] sustrik and are dropped when the application exits
[15:46] sustrik thus you see gaps in the sequence
[15:47] mikko is there any merit in XPUSH/XPULL sockets where the communication is two way, a bit like XPUB/XSUB forwarding but rather for a small ACK that the message has been received
[15:48] mikko i think a script that consumes five messages or so would not be a unique use-case for scripting languages
[15:48] mikko as the processes are often short-lived
[15:53] mikko and with load-balancing it's very hard to rely on seq
[15:56] guido_g or retrieve and re-dispatch the messages when a new connection is established
[15:57] mikko are the messages currently dropped after zeromq?
[16:00] guido_g as far as i understood what sustrik_ said, they're dropped when the connection closes
[16:02] mikko i understood that they are already in the network buffer (out of reach for zeromq)
[16:03] guido_g yes, this is the main part of the problem
[16:04] guido_g at least for small messages
[16:06] guido_g it seems the meta-pattern for ømq is that you always need at least one other socket to manage the one you care about
[16:07] mikko hmm
[16:08] sustrik to get precise hwm, a duplicate ack mechanism can be implemented on 0mq level
[16:09] mikko i see some merit to that
[16:09] sustrik thus, TCP would ack packets, whereas 0mq would ack messages
[16:10] mikko sustrik_: im trying to create a device which would allow replaying streams
[16:11] sustrik what does that mean exactly?
[16:11] mikko well, guys i know are looking at kafka and it provides a mechanism to replay N minutes of stream
[16:11] mikko so a persistent storage is involved there
[16:12] mikko what i was planning is to store messages and push them out to normal push socket and have separate XREP socket where you can ask for "deltas"
[16:12] mikko so if a consumer needs N amount of data to be productive you could request last 1000 messages or so
[16:12] mikko and then start consuming the pull feed
[16:13] sustrik how does that work with PUSH socket?
[16:13] sustrik shouldn't it be PUB?
[16:13] guido_g something like ZMQ_RECOVERY_IVL?
[16:13] mikko it could be PUB as well but with PUB i have no information when HWM is reached
[16:14] mikko i started by creating a device which writes to store when hwm is reached
[16:14] guido_g hwm is per connected pull, right?
[16:14] mikko yes
[16:14] guido_g so there is no overall hwm on the push side
[16:14] mikko no
[16:15] guido_g then i cant figure out why you need hwm here
[16:15] mikko guido_g: the behavior for PUB is to drop messages when there are no consumers
[16:16] guido_g ack
[16:16] Evet mikko: are you going to use kyoto cabinet as cache server?
[16:16] mikko guido_g: by writing to a PUB socket i don't really know if consumer has got it
[16:16] guido_g mikko: i know
[16:16] guido_g but you need a per client sequence management anyway
[16:17] mikko guido_g: i dont need all clients receiving all messages
[16:17] sustrik but what you want to do is to distribute the messages to *all* consumers, not load balanace them among consumers, right?
[16:17] mikko guido_g: im not maybe explaining this well
[16:17] guido_g mikko: and don't follow you well :)
[16:17] sustrik maybe explain the use case
[16:18] mikko so, what i am mixing up here is the end-system and what i have now. in the end-system i will have two kinds of consumers
[16:18] mikko consumer A is consuming from PUSH socket and receives every Nth message based on load-balancing
[16:19] mikko and consumer B which might want messages from last 20 minutes
[16:19] mikko the B would be XREP/XREQ i guess
[16:19] guido_g right
[16:19] sustrik so B only wants a log of message
[16:20] sustrik some of those are already processed etc.
[16:20] guido_g and the A type is not allowed to lose messages if one client crashes
[16:20] mikko guido_g: yes
[16:20] mikko sustrik_: yes
[16:20] guido_g ah ok
[16:20] mikko i need to be sure that each message is processed at least once
[16:20] mikko and some _might_ be processed multiple times
[16:20] sustrik mikko: there's no way to solve that
[16:21] guido_g except w/ a control socket per client
[16:21] sustrik it's the classic "guaranteed delivery" problem
[16:21] sustrik when failure happens there are always some messages in "dubious" state
[16:22] mikko sustrik_: i am fine with that
[16:22] mikko sustrik_: but in the current situation i lose messages in "normal" operation
[16:22] sustrik i see
[16:22] mikko a script connecting, consuming 5 and exiting
[16:22] mikko another script connecting, consuming 10 and exiting
[16:22] mikko i lose large amount of messages there
[16:23] mikko by lose i mean i have no visibility where they have gone
[16:23] mikko from my device point of view i have sent them and from consumer point of view nothing has been sent
[16:24] sustrik right, the only solution is to implement acks at 0mq level
[16:24] mikko if the consumer sent back "got it in 0mq, thanks" it would be enough for normal operation
[16:24] mikko yes
[16:24] mikko i don't think this is required for all use-cases but certainly it seems useful for scripting languages
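The application-level ack idea could be sketched like this (a toy, assuming pyzmq; the second socket pair carrying acks and the sequence-number format are invented for illustration, not an existing 0mq feature):

```python
# Toy sketch of application-level acks: data flows over one PUSH/PULL
# pair, acks flow back over a second one, and the producer keeps a set
# of unacked sequence numbers it could replay after a consumer crash.
import zmq

ctx = zmq.Context()
data_out = ctx.socket(zmq.PUSH)
data_in = ctx.socket(zmq.PULL)
ack_out = ctx.socket(zmq.PUSH)
ack_in = ctx.socket(zmq.PULL)
data_out.bind("inproc://data")
data_in.connect("inproc://data")
ack_in.bind("inproc://acks")
ack_out.connect("inproc://acks")

# producer side: send, remembering what is not yet acknowledged
unacked = set()
for seq in range(5):
    data_out.send_string(str(seq))
    unacked.add(seq)

# consumer side: process each message, then ack its sequence number
for _ in range(5):
    seq = data_in.recv_string()
    ack_out.send_string(seq)

# producer side: trim the unacked set as acks arrive; anything still
# here after a failure would be re-sent (possibly delivering duplicates)
for _ in range(5):
    unacked.discard(int(ack_in.recv_string()))

for s in (data_out, data_in, ack_out, ack_in):
    s.close()
ctx.term()
```

As the discussion below notes, rescheduling whatever remains in `unacked` trades ordering (and possible duplicate delivery) for not silently losing messages.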
[16:25] stimpie I am running an experiment where several threads send messages to one other thread using tcp. All sending threads have setHWM(2). When I suspend the receiving thread the others keep sending.
[16:25] guido_g also for the classic butterfly pattern
[16:26] stimpie If I resume the receiving thread after more then 100 sent messages it receives them all.
[16:26] sustrik mikko, guido_g: yes, it's specific to push/pull pattern
[16:26] stimpie I was expecting only 2 messages would be queued
[16:26] sustrik stimpie: you have to use latest version of 0mq from github and set HWM on both sending and receiving side
[16:27] sustrik mikko, guido_g: i see two options here
[16:27] sustrik 1. standard acks
[16:27] guido_g i'm atm writing something for this in python, a thing that acts as start and endpoint of a butterfly-like system
[16:28] sustrik the obvious problem with acks is that if the peer exits without acking all the messages dispatched to it, those have to be rescheduled
[16:28] sustrik to another peer
[16:28] sustrik thus ordering is not preserved
[16:28] guido_g right, so one needs to keep a backlog
[16:29] guido_g right
[16:29] sustrik for example, peer may get messages 1,2,3,7,8,9,4,5,6
[16:29] sustrik another option would be implementing an explicit shutdown handshake
[16:29] guido_g but you can't preserve order on a push -> multiple pull system anyway
[16:30] sustrik client says "i am about to exit"
[16:30] guido_g ahh like shutdown(1)
[16:30] sustrik then it consumes all remaining messages
[16:30] sustrik then it exits
[16:30] sustrik there's no re-ordering problem there
[16:30] mikko so like linger for incoming
[16:30] guido_g nice, but wouldn't help much in case if failure
[16:30] guido_g *of failure
[16:30] sustrik but, otoh, the shutdown sequence may hang up
[16:31] sustrik yes, same as linger, just in opposite direction
[16:31] sustrik yes, it would work only for orderly shutdown
[16:31] sustrik however, reliable delivery in case of failure is a myth
[16:32] sustrik the problem can be mitigated, but never solved
[16:33] guido_g right
[16:34] guido_g but the mitigation would/does help a lot in most cases
[16:34] mikko i am inclined to say that ACK makes it a bit more resilient. maybe we can document that the trade-off is that delivery order will not be guaranteed
[16:35] sustrik yes, possibly
[16:35] mikko also there is a throughput trade-off as well
[16:35] mikko but there always is
[16:35] sustrik actually, with butterfly pattern there's no overall ordering guaranteed anyway
[16:36] guido_g as i said
[16:36] sustrik if you have at least 2 workers
[16:36] guido_g hehe
[16:36] sustrik they can process messages at different speeds
[16:36] sustrik and thus mix the stream
[16:36] sustrik when it gets joined in the next step
[16:37] sustrik mikko: i don't think there's much of throughput impact
[16:37] sustrik the acks can be sent opportunistically
[16:37] sustrik just once in a while
[16:37] sustrik thus having close to zero performance impact
[16:40] mikko hmm, that sounds pretty good
[16:41] mikko is the same infrastructure that is used in subscription forwarding suitable for this?
[16:41] sustrik the nice thing is that at most 1 message would be lost even in the case of failure
[16:42] sustrik the one that was being processed at the moment
[16:42] sustrik mikko: not really
[16:42] sustrik different functionality is needed
[16:43] sustrik the socket would have to keep list of sent but unacked messages
[16:43] sustrik trim it when ack is received
[16:43] sustrik and resend the messages in case of connection failure
[16:44] sustrik there are some strange corner cases involved
[16:44] sustrik say, what if connection fails and there's no other connection to resend the messages?
[16:45] mikko buffer the messages and honor HWM?
[16:45] sustrik dunno
[16:45] sustrik i'm just thinking aloud
[16:46] mikko there also might be duplicate delivery
[16:46] mikko i guess
[16:46] sustrik yes
[16:46] mikko unless you ACK the ACK
[16:46] mikko then you need to ACK the ACK ACK with ACK
[16:47] sustrik as i said
[16:47] sustrik it's an unsolvable problem
[16:47] sustrik you can mitigate it by adding more acks
[16:47] mikko but if we optimize for normal operation
[16:47] sustrik and ackacks etc.
[16:48] mikko i think ACK is good enough for majority of the cases
[16:48] sustrik yes
[16:50] traviscline if there are any geventers:
[16:51] traviscline mikko: thanks again for the input, going to get a little perf bench set up and cythonify it
[18:30] lechon hello, is anyone having problems with the lua binding?
[18:31] lechon i just installed a fresh zeromq, lua, and lua-zmq bindings:
[18:32] lechon zmq.DOWNSTREAM is nil?
[18:35] guido_g ouch
[18:35] guido_g DOWNSTREAM is old
[18:35] guido_g very old
[18:35] guido_g so i guess the bindings are not up to date
[18:36] lechon ohh
[18:36] lechon i see PUSH and PULL work
[18:37] lechon whoops
[18:37] guido_g case closed :)
[19:55] ngerakines hey folks
[19:55] ngerakines I've got a question about connection timeouts
[19:55] ngerakines anyone around?
[19:59] traviscline ngerakines: general irc etiquette is to just ask, don't ask to ask
[19:59] ngerakines fair enough
[20:23] cremes ngerakines: so what's your question?
[20:28] lechon is any kind of unreliable transport like udp supported?
[20:29] cremes lechon: not right now but new transports can be added
[20:29] cremes search the mailing list for earlier discussions
[20:29] cremes i think the main devs want to clean up that api a bit to make this easier; having someone actually work on adding UDP
[20:30] cremes would be a great exercise for doing that cleanup
[20:30] mikko udp is slightly problematic for certain semantics
[20:31] lechon is there some kind of state overhead that needs to be reliable?
[20:32] mikko lechon: well, for example it's guaranteed that you will only receive full messages
[20:32] mikko sending larger messages over udp means that all packets must arrive
[20:32] mikko if you lose a packet in the middle you need retransmission
[20:33] lechon handling it the naive way would just be inefficient
[20:33] lechon yet probably acceptable for applications electing to use udp (where perhaps a single dropped packet renders the entire message useless)
[20:36] mikko there hasn't been that much talk about UDP to be fair
[20:36] mikko i remember there has been discussion about SCTP
[20:36] mikko and a few others
[20:38] mikko i remember UDT being mentioned at some point
[20:39] lechon i was able to find these two on the mailing list archive:
[20:40] mikko yes, you can use openpgm over udp
[20:40] mikko but that's not strictly UDP semantics
[20:41] lechon i'm trying to stream over the internet and can cope with lost packets so unreliable would be preferable
[20:53] ngerakines Sorry, got pulled into a meeting
[20:54] ngerakines Is there any further documentation on handling connection timeouts?
[20:55] mikko ngerakines: no, not really
[20:55] ngerakines bummer
[20:55] mikko ngerakines: what sort of situation?
[20:55] mikko ngerakines: in most cases you shouldn't care about connection timeouts etc
[20:55] mikko as zeromq takes care of reconnecting under the hood
[20:56] ngerakines I've got a small client executable that creates a request to a daemon that may or may not be up, but when it isn't, the app hangs until a connection can be established
[20:56] mikko you can use zmq_poll
[20:56] ngerakines with my understanding of things, the next thing to do would be use a while loop and zmq_poll ... yeah
[20:57] mikko or you can do non blocking send
[20:57] ngerakines reference?
[20:57] mikko depends on what you want to do if the daemon is not up
[20:57] mikko wait or just forget about it
[20:59] ngerakines just forget about it is preferable
[21:00] mikko ok, then you can use a non-blocking send
[21:00] mikko which language are you using?
[21:00] ngerakines this executable is called hundreds of thousands to millions of times a day
[21:00] ngerakines c++
[21:00] ngerakines so when it hangs, it can cause system/resource issues
[21:00] mikko pass ZMQ_NOBLOCK as second arg to ->send ()
[21:00] ngerakines ok
[21:00] ngerakines thanks much!
[21:00] mikko it will return false if the message was not sent
[21:00] mikko and errno will be set to EAGAIN
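mikko's suggestion, sketched in Python with pyzmq rather than the questioner's C++ binding (same ZMQ_NOBLOCK semantics; pyzmq surfaces EAGAIN as a zmq.Again exception). Note the sketch uses a socket with no peer at all to force the immediate failure; once connect() has created a pipe, libzmq will happily queue messages up to the HWM even while the daemon is down.

```python
# Non-blocking send sketch: a PUSH socket with no connected peer cannot
# queue the message, so send(..., zmq.NOBLOCK) raises zmq.Again
# (errno EAGAIN) instead of hanging.
import zmq

ctx = zmq.Context()
push = ctx.socket(zmq.PUSH)       # deliberately not connected anywhere
push.setsockopt(zmq.LINGER, 0)    # don't block on shutdown either

try:
    push.send(b"hello", zmq.NOBLOCK)
    sent = True
except zmq.Again:                 # the "just forget about it" path
    sent = False

push.close()
ctx.term()
```

In the C++ binding the equivalent is checking the return value of `socket.send(msg, ZMQ_NOBLOCK)` and inspecting errno for EAGAIN, as mikko describes.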
[21:01] ngerakines great, i'll readup on it as well
[22:03] lechon mikko, from browsing the source i get the feeling that udp versions of tcp_connecter and tcp_listener would need to be written
[22:09] mikko and possibly tcp_socket
[22:09] mikko not sure if that is identical
[22:10] mikko the problematic thing with udp is that you might connect to let's say 10 endpoints
[22:10] mikko and if 8 of them go down how do you know about it?
[22:11] mikko somehow i think in the context of zeromq tcp makes a lot more sense
[22:11] lechon hmm
[22:11] mikko like for example PUSH socket will load-balance between connections
[22:12] mikko and stops dispatching messages to peers that fail
[22:12] mikko with udp this semantic doesn't really work
[22:12] mikko unless you do explicit ACKs from the consumers
[22:15] lechon i see what you mean
[22:15] cremes i think for udp you would just say this is transport-specific behavior
[22:15] cremes delivery is best effort
[22:16] cremes if you start adding acks, you are duplicating tcp (and probably poorly)
[22:16] mikko i don't see whether there is that much merit to udp in message oriented communications
[22:16] lechon yeah. when you elect to use udp you probably have some out-of-channel way to determine when to stop sending to a particular host
[22:16] lechon its not zeromq's concern
[22:16] cremes right
[22:17] cremes as for keeping message delivery atomic, i think that would be transport-specific too
[22:17] cremes if a message part gets lost in delivery, drop the whole message
[22:17] lechon yep
[22:18] cremes udp packets can be up to 64k in length, right?
[22:19] lechon yes
[22:19] lechon including header
[22:19] cremes perhaps the udp transport could just coalesce all parts into one packet before sending, kind of like the nagle algo
[22:20] lechon what do you mean "all parts"?
[22:20] cremes and then set a max of 64k (minus headers) for messages using that transport
[22:20] cremes are you aware of the RCV_MORE and SND_MORE flags?
[22:20] mikko cremes: assuming the total size is < 64k
[22:20] lechon no :/
[22:20] cremes mikko: yes
[22:21] cremes lechon: check those out; they let you logically split up a message into parts for 0mq to deliver as an atomic whole
[22:21] cremes i.e. message-oriented streaming
[22:21] mikko the thing i am wondering is whether you actually need zeromq for udp? most of the functionality will be specifc to udp
[22:22] cremes 0mq still provides some neat abstractions for it
[22:22] mikko effectively you just need to frame the messages and you are about at the same point as you are with zeromq + udp
[22:22] lechon mikko, the zeromq interface is nice :]
[22:22] cremes though it's less useful for udp than other protocols
[22:24] lechon yes, coalescing all of the parts of those split messages would make sense
[22:26] lechon the receiving end would need to drop the entire message if all of the parts couldn't fit into one udp packet and one got lost
[22:28] lechon when sending multipart messages like that does 0mq have a protocol for communicating the number of parts the receiver should expect?
[22:30] mikko looks like the udp transport earlier was epgm
[22:30] mikko looking at changelog
[22:30] mikko so there never was a real UDP transport
[22:33] lechon looks like it is used in resolve_nic_name for solaris/aix/hpux :P
[22:44] lechon or is the number of parts usually encoded in the tcp packets?
[22:46] mikko lechon: sorry?
[22:49] lechon i was referring to my previous question about how multipart messages are delivered with the existing transports
[22:50] traviscline ngerakines: general irc etiquette is to just ask, don't ask to ask
[22:50] traviscline ngerakines: hey sorry, accidentally hit up-enter
[22:50] ngerakines np
[22:53] mikko lechon: the number of parts doesn't need to be known beforehand
[22:53] mikko on the sender side the ZMQ_SNDMORE flag indicates that more parts of the message will follow
[22:54] mikko a send without ZMQ_SNDMORE after one or more ZMQ_SNDMORE sends terminates the multipart message
[22:54] mikko on the receiver's side there is ZMQ_RCVMORE, which indicates whether more parts are coming
[22:56] lechon it is up to the user to make sure that the receiver calls RCVMORE the same number of times that the sender calls SNDMORE?
[22:56] mikko well, if the user wants to be aware of multipart messages
[22:56] mikko you could just receive a message at a time without having to acknowledge that it's actually one multipart message
[22:57] lechon that should work fine with udp and coalesced packets
[22:58] cremes lechon: yes, you loop on receiving msg parts until getsockopt(RCVMORE) becomes false
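The SNDMORE/RCVMORE mechanics described above, sketched with pyzmq (the PAIR sockets over inproc and the part names are just for the demo): the part count is never transmitted; the receiver simply loops while the RCVMORE option stays true.

```python
# Multipart send/receive: no part count on the wire, the receiver
# loops until getsockopt(ZMQ_RCVMORE) reports no more parts.
import zmq

ctx = zmq.Context()
a = ctx.socket(zmq.PAIR)
b = ctx.socket(zmq.PAIR)
a.bind("inproc://parts")
b.connect("inproc://parts")

a.send(b"header", zmq.SNDMORE)
a.send(b"body", zmq.SNDMORE)
a.send(b"trailer")                 # no SNDMORE: terminates the message

parts = [b.recv()]
while b.getsockopt(zmq.RCVMORE):   # more parts of the same message?
    parts.append(b.recv())

a.close()
b.close()
ctx.term()
```

All parts are delivered atomically or not at all, which is the property the udp discussion above wants to preserve by coalescing the parts into a single packet.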
[22:59] mikko lechon: how do you maintain low latency in cases where you dont have constant throughput?
[22:59] mikko lets say user pushes 2K to the buffer. would there be a timer that waits for more?
[23:00] mikko well, a timer that triggers after certain period if no more messages come?
[23:00] lechon hmm, thats an interesting problem
[23:00] lechon it would have to be a user defined timer, and that might (?) be a messy interface change
[23:01] mikko probably not messy
[23:01] mikko sockopt that says the timeout
[23:01] mikko but any kind of timeout would probably introduce latency in environments where you are not constantly pushing messages
[23:02] lechon for asynchronous sends it would be kind of easy. messages could be coalesced on the queue.
[23:04] lechon in many cases the user would probably not want additional latency... udp was selected for a reason
[23:04] mikko all sends are asynchronous in a way when dealing with zeromq
[23:04] mikko you are pushing the message to io thread rather than actually dealing with a socket
[23:05] lechon so sock:send(msg) just queues msg up and returns immediately?
[23:05] mikko lechon: there are different socket options affecting the behavior
[23:05] mikko and there are different behaviors with different socket types
[23:05] mikko in some cases it might block
[23:06] lechon ok. sorry, i haven't really used 0mq yet
[23:06] mikko what is the use-case you are looking at udp for?
[23:07] mikko also, it might possibly be beneficial to look into UDT
[23:07] lechon given that, it might make sense to only attempt packet coalescing when in "async mode" and things can be grouped on the internal queue
[23:07] mikko as it seems to provide some semantics that are common for zeromq sockets
[23:07] lechon reliable, yuck :P
[23:08] lechon my use-case is streaming video over internet
[23:08] mikko a certain amount of reliability on transport layer makes sense with message oriented approach in my opinion
[23:08] mikko as you are not really dealing with packets or streams but rather with a concept of message
[23:11] lechon streams are just a sequence of messages, but i get your point
[23:11] mikko as in i assume if you stream video you are not actually dealing with messages but rather packets. and the next packet doesnt really depend on the previous packet
[23:12] mikko so in case of losing packet here or there the sound might jump a bit or so
[23:12] mikko but in case of a concept of message losing a packet means that your whole message is void
[23:12] mikko brb
[23:13] cremes i would only suggest msg part coalescing into 1 udp packet when someone is sending multipart messages
[23:14] cremes otherwise, just send them asap
[23:14] cremes i don't see why you would ever want a timer for sending udp
[23:15] lechon how do you know they are finished sending more?
[23:15] cremes lechon: how does who know? the receiver?
[23:16] cremes the sender knows because he doesn't pass the SNDMORE flag to send
[23:16] cremes the receiver knows where the multipart msg ends because getsockopt(RCVMORE) returns false
[23:17] cremes lechon: you should really read the docs; a lot of this will get cleared up once you understand 0mq a bit better
[23:17] lechon there might be a large delay between the last send(SNDMORE) and the following send()
[23:18] cremes in that case, the i/o thread should keep the message parts in the queue until the last part is sent
[23:18] cremes 0mq is a message queue after all!
[23:18] lechon heh right, but the timeout could be used to expedite that, so things dont hang round in the queue for too long
[23:19] cremes then that breaks my suggestion for all msg parts to be coalesced into a single packet for udp transport
[23:19] cremes it also makes it harder on the receiving application; now it has to deal with fragmented messages
[23:20] cremes recall that udp doesn't guarantee order, so parts could also show up in a random order
[23:20] cremes i think forcing msg part coalescing to one packet for udp is a pretty good idea otherwise you break a lot of 0mq guarantees that other transports get
[23:21] cremes in that case, no timer is desired
[23:21] lechon even if you try to coalesce all parts into a single packet, you might not be able to... the sum of all the packets may be > 64k
[23:21] lechon all the parts*
[23:22] cremes true; so the udp transport would need to specify that 64k (minus headers) is the max message size *for that transport*
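The coalescing idea under discussion could be sketched like this (pure Python; the 4-byte length prefix, the function names, and the size cap are assumptions of this sketch, not anything 0mq actually implements):

```python
# Max UDP payload over IPv4: 65535 minus 8-byte UDP and 20-byte IP headers.
MAX_DATAGRAM = 65507

def coalesce_parts(parts, max_size=MAX_DATAGRAM):
    """Pack every part of one multipart message into a single datagram.

    Hypothetical framing: each part is prefixed with a 4-byte big-endian
    length so the receiver can split the datagram back into parts.
    Raises ValueError when the framed message cannot fit in one datagram,
    matching the "64k minus headers is the max message size" constraint.
    """
    framed = b"".join(len(p).to_bytes(4, "big") + p for p in parts)
    if len(framed) > max_size:
        raise ValueError("multipart message exceeds one UDP datagram")
    return framed

def split_parts(datagram):
    """Inverse of coalesce_parts: recover the original list of parts."""
    parts, i = [], 0
    while i < len(datagram):
        n = int.from_bytes(datagram[i:i + 4], "big")
        parts.append(datagram[i + 4:i + 4 + n])
        i += 4 + n
    return parts
```

Because the whole multipart message travels in one datagram, a dropped packet voids the whole message rather than leaving the receiver with a fragment, which is the guarantee cremes is arguing for.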
[23:22] lechon ok, sure
[23:22] cremes we've now come full circle; udp isn't a good choice for transport in 0mq
[23:23] cremes it will likely gain other "special" cases which will make it less suitable
[23:23] mikko 23:14 < cremes> i don't see why you would ever want a timer for sending udp
[23:23] mikko there might be processing in creating the parts
[23:23] cremes one of the great things about 0mq is being able to change transports without modifying your code
[23:23] mikko 1. send part with ZMQ_SNDMORE 2. process for 10 seconds 3. send the last part
[23:23] cremes mikko: tell me how the receiving end should handle missing msg parts
[23:24] mikko cremes: there are no missing parts there
[23:24] mikko cremes: but your problem is the sender
[23:24] lechon if udp drops one part
[23:24] cremes ok, then we're talking about different things
[23:24] mikko that would destroy the sequence
[23:24] cremes why do you want a timer?
[23:24] mikko cremes: if you want to coalesce into packets you need a timer
[23:24] cremes explain why
[23:25] mikko cremes: let's say i push 2KB to zmq_socket
[23:25] mikko there's 62K left in the packet
[23:25] mikko right?
[23:25] cremes with you so far
[23:25] mikko now let's say i do processing for 10 seconds
[23:25] mikko and send 100K after that
[23:25] mikko is the 2KB pending until packet boundary is full?
[23:25] cremes is the 2kb its own message part or is it a piece of a multipart message?
[23:26] mikko either way
[23:26] lechon it should be part of a multipart for the argument to make sense i think.
[23:26] cremes if it is NOT a message part, it should be sent immediately; no reason to coalesce
[23:26] mikko so let's say it's part of multipart message
[23:27] mikko when does the 2K message leave the buffer?
[23:27] cremes then your example doesn't work; the second/last part (100k) exceeds the max size of a single udp packet
[23:27] mikko cremes: my message might be larger than udp packet
[23:27] cremes and i think it's a really bad idea to split 0mq messages over multiple udp packets
[23:28] cremes mikko: if it's too big, pick a different transport; udp is not a good choice
[23:28] mikko this is why i don't see udp as zeromq transport
[23:28] lechon what if the second message was 50K?
[23:28] cremes we agree then
[23:28] mikko it's still problematic
[23:28] mikko because if my second part is 20K
[23:29] mikko when would my first 2K leave the buffer?
[23:29] mikko after the message sequence is complete?
[23:29] cremes yes
[23:29] lechon the idea is to never send multipart parts separately
[23:29] cremes udp transport would be limited to 64k packets; no other transport has a limitation like that
[23:30] mikko cremes: hence im wondering if it's aligned with the existing transports
[23:30] mikko UDT and SCTP seem to be more suitable imho but maybe im not seeing the benefits
[23:31] lechon why not allow all of the udp issues to leak to the user?
[23:32] lechon if things are out of order, users problem.
[23:32] lechon if a piece of the packet drops, users problem
[23:32] lechon etc.
[23:32] mikko lechon: because that's one of the things zeromq2 does for you, abstracts all that away from the user
[23:32] mikko well, not in all cases
[23:33] mikko but the idea is not to care about the network transport
[23:33] cremes exactly
[23:33] mikko rather think about in terms of messages
[23:34] lechon "unreliable messages" could be a way to think about it, but i see how that would not allow interchangeable transports (unless apps were written assuming unreliability)
[23:36] mikko maybe there is some merit to fire-and-forget type messages
[23:36] lechon another option could be to disallow multipart messages with udp
[23:37] mikko your message could still be very large
[23:38] mikko so the limitation of 64K per message + no multipart would require user to worry about transport type quite a lot
[23:38] lechon without multipart, would 64K be that big of a problem?
[23:38] lechon if one packet is dropped then the whole message is dropped
[23:39] mikko hmm
[23:40] mikko i don't know, somehow i don't see it fitting the model
[23:40] mikko but maybe have a chat with sustrik_ as well
[23:40] mikko i need to sleep in any case
[23:40] mikko good night
[23:43] lechon night
[23:56] mikko couldn't sleep
[23:56] mikko lechon: you could also send your ideas to mailing-list
[23:56] mikko for wider feedback