Thursday October 20, 2011

[Time] Name Message
[03:19] aroman is it possible to do async RPC with req/rep? I have a process that is doing some IO bound stuff that takes a while to complete, but I need it to be able to accept additional requests while one is still going.
[03:29] aroman oh I see now,
[03:29] aroman dealer/router is *exactly* addressing my issue
[03:41] mattbillenstein aroman: I've been looking at this a bit as well — to implement msgpack-rpc
[03:45] aroman mattbillenstein: ooh, that is very interesting
[03:45] aroman hadn't heard of msgpack-rpc before, but I could definitely use it.
[03:45] mattbillenstein I was working on a zeromq - msgpack-rpc implementation
[03:46] aroman I currently use JSON to power my zeromq RPC system, but JSON is definitely not the bottleneck for my scale -- I just need asynchronous RPC
[03:46] mattbillenstein I backed out the zeromq part of it after I couldn't really figure out exactly how to handle failure cases
[03:46] mattbillenstein but still interested in seeing if it's possible to make it all work
[03:47] aroman i'm sure it is, but zeromq's inner machinations still escape me
[03:47] mattbillenstein json doesn't handle binary data well
[03:47] mattbillenstein well
[03:47] aroman i'm still barely wrapping my head around the DEALER syntax
[03:47] aroman yep, and I'm only sending very small lists or HTML.
[03:47] mattbillenstein I went about doing this using regular tcp sockets and it seems to work pretty well
[03:48] aroman (RPC for a Python web scraper doing async http requests)
[03:48] mattbillenstein gevent or eventlet?
[03:48] aroman neither, Tornado's async http client
[03:48] mattbillenstein ah, cool
[03:48] aroman since pyzmq has tornado support built in
[03:48] mattbillenstein I was using gevent — sounds about like the same thing
[03:49] mattbillenstein there is a library for making zmq sockets block in a greenlet — gevent-zeromq
[03:49] aroman (literally one line of code to make the event loops of tornado and zmq work together)
[03:49] aroman s/loops/loop/
[03:49] aroman mattbillenstein: why would one want blocking sockets?
[03:49] aroman (in an eventloop, I mean)
[03:50] mattbillenstein well, to the user, it seems like they block, but they just yield to the event loop so something else can run
[03:51] mattbillenstein does tornado use callbacks?
[03:51] aroman yes, it's fully asynchronous
[03:53] aroman actually, it looks like I don't really need zeromq at all
[03:54] aroman msgpack-rpc is solving literally exactly what I need -- node.js async, parallel pipelined RPC to python.
[03:54] mattbillenstein word — never been a fan of writing code using callbacks — big fan of eventlet/gevent
[03:54] mattbillenstein I implemented that interface
[03:54] aroman node.js RPC to python?
[03:54] mattbillenstein but I never have more than one message in flight down a single connection at time
[03:55] mattbillenstein so to make a bunch of async calls, I just spawn a new greenlet which creates a new connection, makes a request - when it's done, it closes the connection and hands back the response
[03:55] mattbillenstein no, nothing with node
[03:55] mattbillenstein msgpack-rpc is just a specification
[03:55] mattbillenstein I'm doing python <—> python now
[03:55] mattbillenstein but I may use the same machinery to do python <—> c++ soon
[03:55] aroman ah
[03:56] aroman my project is mostly aimed at getting me more familiar with zeromq/node/ etc, so I'm kind of reluctant to simply bail on zeromq
[03:56] aroman but it definitely seems that i'd be reinventing a lesser wheel as compared to using msgpack-rpc
[03:56] mattbillenstein well, and I use connection pools, so I'm not necessarily creating a new connection each time — so I think it's pretty decent
[03:57] mattbillenstein well, the tradeoff is tcp vs zeromq
[03:57] mattbillenstein you can use msgpack in either case
[03:57] aroman what do you mean?
[03:58] mattbillenstein like msgpack is basically json
[03:58] mattbillenstein it's how you encode the messages on the wire
[03:58] mattbillenstein the wire in this case could be either a tcp connection, or zeromq, or something else entirely
[03:58] aroman ohh
[03:59] aroman so zeromq is not mutually exclusive to msgpack
[03:59] aroman zeromq being the transport and msgpack being the data format
[03:59] mattbillenstein exactly
[03:59] aroman which goes back to your original statement about trying to get the two working together
[03:59] mattbillenstein I started with zeromq and msgpack
[03:59] mattbillenstein but replaced zeromq with tcp
[03:59] aroman ah.
[04:00] mattbillenstein think I just don't understand zeromq well enough and i wanted to get this code out into production
[04:00] mattbillenstein but I might try again sooner or later
[04:01] aroman yeah. I have no actual deadline or concrete end-goal here, so I have the luxury of evaluating my options
[04:02] aroman so maybe what i'll do is use zeromq's DEALER pattern with JSON (since I'm already using JSON, just with sync req/rep sockets) and then see about moving to msgpack
[04:03] mattbillenstein cool, I'd be interested to hear how that turns out
[13:31] rays any idea why this test fails?
[14:31] mikko rays: that is a hefty test
[15:53] cremes rays: what does that code output? when it "fails" does it print an assertion, hang, ... ?
[15:53] cremes if it hangs, can you attach with gdb and get a backtrace?
[17:03] cratores howdy all, getting: Assertion failed: (msg_->flags | ZMQ_MSG_MASK) == 0xff (zmq.cpp:223) on a moderately high traffic app ( about 500k msg/s 15B each) using only inproc:// endpoints between threads, is this worth submitting a tkt or is it likely me (and if so any tips on debugging)? zmq 2.1.9
[17:03] cratores this happens randomly
[17:03] cratores it will sometimes happen reasonably fast and sometimes the app will run for 15 min
[17:04] cratores sorry not 2.1.9, 2.1.10
[17:04] cratores forgot i updated today to see if it would help (does not)
[17:12] mikko cratores: what sort of socket types?
[17:13] mikko cratores: is this reproducible in a smaller test case?
[17:14] mikko cratores: that looks like an invalid message is being accessed
[17:14] mikko for example one that has been closed
[17:15] mikko brb
[17:19] cratores i haven't tried reproducing, but i will if i can't figurei t out. the sockets are pub/sub and there's only 1 of each (so maybe I should use pairs)
[17:19] cratores the thing is i don't know how i would be accessing an invalid msg, id on't do anything fancy like no copy
[17:20] cratores all i do is send/recv with locally created msgs (on the stack, not heap)
[17:21] cremes cratores: accessing messages on the stack might be your issue
[17:21] cremes any chance you can easily change your code to allocate messages on the heap instead?
[17:21] cratores but i copy all the data i need before they go away?
[17:22] cratores all i'm saying is I use them like: func(x,y,z) { zmq:msg_t gets used here and all of its data copied to my own structures }
[17:23] cratores i could just be doing something stupid elsewhere and corrupting of course, so there may not be anything you guys can tell me
[17:24] cratores it's just such a pain to debug multithreaded i/o using gdb, of course the problem doesn't occur in debug since it runs 50x slower
[17:24] cremes cratores: valgrind it... you might be doing something funky that valgrind can catch
[17:24] cratores yeah
[17:27] guido_g messages on the stack is a bad idea, afair inproc just copies pointers, so the messages might be already gond/garbled when the receiver accesses them
[17:30] cratores sorry i'm not being clear
[17:30] cratores i'm not passing them on the stack
[17:31] cratores i'm creating them in a function, receiving data into them
[17:31] cratores locally
[17:31] cratores and then copying what i need before the function is done
[17:32] cratores like func() { msg m; socket.recv(m); copy_stuff_from_m(m); }
[17:32] cratores anyway, i'm going to try valgrind
[18:03] guido_g and on the ending side?
[18:26] mikko guido_g: the message data is refcounted
[18:26] mikko stack is perfectly fine for the message container
[18:28] mikko zmq_msg_init () allocates the contents on the heap
[18:28] mikko unless the message is a VSM
[18:29] mikko which i can't remember how it's handled
[18:37] guido_g ok
[18:37] guido_g is a 15 byte msg a vsm?
[18:40] mikko guido_g: probably
[18:41] mikko zmq_msg_init_data might not be ok from stack
[18:41] guido_g ahh thx
[18:42] mikko ZMQ_MAX_VSM_SIZE 30
[18:43] guido_g so we need to know where the message data resides
[18:49] cratores on both sides i create local zmq_msgs (sorry i use c++ wrapper so these are zmq::msg_t's), recv from the socket, then copy from the msg onto my own structures all in the same function
[18:49] cratores this not zero copy, why would this not be ok?
[18:50] cratores i do this all over this place in apps that run over 1mm msgs a second on Windows and don't have any problems
[18:50] cratores i'm just porting my apps to linux now and running into issues
[18:50] mikko cratores: it's ok
[18:51] guido_g w/o code impossible to tell
[18:51] mikko cratores: it would be beneficial to run the app under valgrind
[18:51] mikko to see if there are memory errors somewhere
[18:51] cratores yes i'm installing it now
[18:52] cratores never used it before since i'm mostly a windows developer, so i'm learning
[18:52] mikko cratores: also, if you can put the relevant parts to it would help
[18:52] cratores will keep the channel posted
[18:52] mikko i'm gonna hit the gym, will be back in 2h or so
[19:04] espeed when using a Streamer Push/Pull device, sometimes messages are lost. For example, a client will use TCP to push 1000 messages to an internal worker on the same host, and the worker will only receive ~108 of them. HWM is set to default. Is this likely a buffer overrun issue? There is plenty of memory on the system.
[19:09] cremes espeed: are you sure all of the PULLers are connected before you start pushing?
[19:10] espeed I set it to only 1 puller, and it received the first through 108 messages, and it is still running in background
[19:10] cremes espeed: nevermind... PUSH is supposed to block if there are no PULL sockets connected
[19:11] espeed The first time I ran it, it receive all 100 messages
[19:11] cremes espeed: if this is a real bug, then a simple reproduction ought to be simple to make
[19:11] espeed subsquent times the quantity received fluctuated
[19:11] espeed *all 1000
[19:11] cremes what OS and what version of 0mq?
[19:11] cremes and what bindings?
[19:12] cratores valgrind detects nothing btw :(
[19:13] cremes cratores: :(
[19:14] cratores well i can easily enough try reproducing in a tiny case in 20 lines of code, but i'm sure it will work of course ;)
[19:14] cremes of course!
[19:14] cratores part of the problem is that valgrind slows down my app to the point that it may not trigger whatever is causing the problem
[19:15] espeed pyzmq, ZeroMQ 2.1.4 on Fedora
[19:20] espeed cremes: let me make sure I understand how 0mq is supposed to buffer -- an available PULL socket will buffer under the limits of available memory and then if no other PULL sockets are available, the messages are discarded?
[19:21] cremes hmmm
[19:21] cratores you mean the push socket will suffer, yes?
[19:21] cremes PULL sockets can receive as many messages as memory allows
[19:21] cratores buffer
[19:22] cremes the PUSH socket and its HWM determine how many messages the "worst" PULL socket can be behind
[19:22] cremes before it blocks
[19:22] cremes by default, HWM is 0 which means "no limit"
[19:22] cremes does that answer your question?
[19:22] espeed yes
[19:24] espeed ok, so then if there is ample GBs of memory on the system, and the PULL socket is "losing" 900 "hello" messages out of 1000, then that's not normal
[19:25] espeed it's so intermittent as to how many messages it will receive
[19:32] rbancroft are you using an up to date version?
[19:35] espeed Here's my test code... Streamer Device: , Client & Worker Tester:
[21:06] cremes espeed: i think i know why you are getting odd results
[21:07] cremes espeed: you aren't closing the sockets and terminating the context at the conclusion of the program
[21:07] cremes so it exits before all messages can be delivered
[21:07] cremes if you used ZMQ_LINGER, closed the sockets when the work was done, and terminated
[21:07] cremes the context to make sure everything was flushed, then you would probably get all messages every time
[21:26] espeed cremes: ahh, thanks -- that's what I was just investigating. However, the guide says "When you use ØMQ in a language like Python, stuff gets automatically freed for you"
[21:27] cremes espeed: sure, but it doesn't say "stuff automatically gets sent for you"
[21:36] mikko flushing is not always enough
[21:36] mikko in some cases things are in tcp buffers
[21:36] mikko which might get discarded
[21:37] mikko see
[21:37] mikko not sure if that's the case here
[21:38] mikko haven't read the whole backlog yet
[21:56] sigmonsay Hello, I'm attempting to call recv_pyobj() on a ipc and am getting an exception about cannot be accomplished in current state
[21:56] mikko sigmonsay: req/rep?
[21:56] sigmonsay Yah
[21:57] mikko request - reply follow a strict pattern
[21:57] sigmonsay I thought the intent was to blcok if the other side wasn't there
[21:57] mikko so request socket is always send/recv/send/recv
[21:57] mikko and reply is always recv/send/recv/send
[21:57] mikko etc
[21:58] sigmonsay Ok, perhaps I have a bug. Just didn't expect an error of this nature
[21:58] mikko if the operation cannot be accomplished in the current state (EFSM) then check that you are following the pattern strictly
[21:59] sigmonsay ty sir, that was it!
[21:59] mikko np
[22:08] espeed set LINGER to 0 on all sockets, changed from TCP to IPC on both, closing all client and worker sockets on termination, and destroying client and worker contexts -- but the problem persists
[22:17] sigmonsay in python does a zmq context create a process or a thread?
[22:19] minrk sigmonsay: no
[22:21] sigmonsay I see two processes forked off of the main sitting in epoll loops, they're not mine afaik
[22:21] minrk the ProcessDevice in the example means run zmq_device in a Process
[22:21] minrk but creating a Context does approximately nothing but call zmq_init
[22:27] sigmonsay well does calling bind or connect call fork? =P
[22:29] sigmonsay Yah, that's exactly what is happening
[22:29] cremes sigmonsay: no, zmq_bind() and zmq_connect() do not call fork
[22:29] sigmonsay in python perhaps?
[22:29] cremes but the python library could be doing so
[22:29] minrk no
[22:31] sigmonsay ps doesn't lie, its a thread or a new process. I'll RTFM
[22:31] minrk how many processes are there?
[22:31] sigmonsay 3 total, one process w/ 1 zmq context kicks off 2 children
[22:32] minrk and what's the code?
[22:34] sigmonsay drop dead simple zmq.REP socket
[22:35] minrk Is is possible to post actual code that sees this fork?
[22:37] sigmonsay they are likely threads then
[22:37] sigmonsay Yep, they're threads. Long way to a short answer
[22:41] jjg hi .. i'm looking for a robust and lightweight http to apmq gateway that keeps a persistent connection to the queue broker .. any tips?
[22:41] sigmonsay dude, wait a second for an answer before cross posting
[22:42] jjg sigmonsay: they're called channels not channel
[22:42] mvolkov Hello, Q: How to find out the volume of queue in a worker? I am using PUSH ... PULL
[22:42] mvolkov is it even possible?
[22:43] minrk @sigmonsay - I think zmq_init does spawn C-threads, but pyzmq definitely does not spawn any processes or *Python* threads unless you are calling an object named 'Thread…' or 'Process...'
[22:44] sigmonsay That's what i'm seeing, still newb here. thanks mikko
[22:44] cremes mvolkov: no, that information is not exposed
[22:44] cremes ok all, on my way to the chicago meetup
[22:45] mvolkov cremes: thank you
[22:46] rem7 Did the structure of msg envelopes change for zeromq3? Im playing address-based routing (ROUTER<->REP) My code seems to work with zeromq2.1.7 but not in 3. It seems it doesn't like the empty msg between the identity and the workload.
[22:48] jjg seb`: will do, tx
[22:49] rem7 if I remove the empty msg then it works in ver3, but for ver2 I need to put the empty msg... I just want to make sure if thats normal...?
[23:29] espeed I found that if I include a sleep(5) at the end of the client function, it stays active long enough for the socket to finish sending the messages to the worker
[23:30] nag espeed: surely that's not ideal
[23:31] espeed no, def not, but now I have a better understanding of where the problem is