Thursday December 9, 2010

[Time] Name Message
[00:00] oxff how can i get all FDs from a zeromq socket?
[00:00] oxff say, i want to embed zeromq into my existing libev app
[00:01] oxff seems that the only way to use zeromq asynchronously is to develop around zmq_poll
[00:04] cremes oxff: read up on ZMQ_FD and ZMQ_EVENTS
[00:04] cremes that is part of the 2.1 release (in beta) which makes it easy to use with other libraries
[00:04] oxff ah thanks
[00:04] cremes and zmq_poll is built on top of those primitives
[00:05] oxff wait it only gives me one fd per zmq_socket?
[00:05] oxff how does it work with a bind socket?
[00:06] oxff or multiple connected sockets?
[00:07] oxff does it create a fake fd?
[00:13] oxff looking at tcp_listener.cpp, it only returns the bound fd in get_fd
[00:13] oxff too tired to read all of the source
[00:19] oxff wait, does zeromq internally use threading / create threads for accepting new connections etc?
[00:22] oxff how can i issue a non-blocking connect with zmq_connect?
[00:28] oxff lol you
[00:28] oxff you're threading and the fd is just some pipe fd you accumulate to?
[00:28] oxff this is bullshit
[01:50] testuser hello
[08:46] rrg hi
[08:49] rrg i am unsure how to implement multiplexed sending using a PUB socket. to be specific: a changing number of producers write multipart messages of varying length, at varying times and from varying threads. do i need to synchronize the writing somehow and abandon the multipart messaging altogether?
[08:54] Steve-o give each thread a 0mq socket then ensure each thread sends multi-part atomically
[08:57] Steve-o you could always add extra sockets, i.e. one per session per thread too
[08:58] mikko Steve-o: im getting openpgm compile failure on x86 solaris 10
[08:58] Steve-o only tried opensolaris on intel
[08:59] Steve-o have you updated the compiler flags?
[08:59] Steve-o >>
[09:00] mikko Steve-o: i have
[09:00] mikko sec
[09:01] mikko
[09:01] mikko i was working on getting --with-pgm build to work on solaris10 using sun studio last night
[09:01] Steve-o ok, x86/Solaris is pretty much a dead platform
[09:01] mikko had to remove the strict flag because __PRETTY_FUNCTION__ is not declared in strict-compliance mode
[09:02] Steve-o that's gcc only ?
[09:02] mikko -Xc mode in solaris studio drops standards incompliant things
[09:02] mikko __PRETTY_FUNCTION__ is "pseudo-standard"
[09:02] Steve-o it should be picking up __func__ instead
[09:03] Steve-o just checked, __PRETTY_FUNCTION__ should only be used with __GNUC__ defined
[09:04] Steve-o I don't even bother with __func__, I think that's only SunPro
[09:04] Steve-o but your error is assembler
[09:04] mikko __func__ is supported by all of them
[09:04] mikko i guess it would be fairly easy to ifdef OPENPGM_FUNCTION or similar
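The ifdef mikko suggests could be sketched like this; `OPENPGM_FUNCTION` is his proposed name, not actual OpenPGM code. `__PRETTY_FUNCTION__` is a GCC extension (in plain C it expands to the same string as C99's `__func__`), so gating on `__GNUC__` keeps strict-mode compilers such as Sun Studio with `-Xc` happy:

```c
/* Sketch of the suggested wrapper: prefer the GCC-only __PRETTY_FUNCTION__
   when available, fall back to the standard C99 __func__. */
#include <stdio.h>

#if defined(__GNUC__)
#       define OPENPGM_FUNCTION __PRETTY_FUNCTION__
#else
#       define OPENPGM_FUNCTION __func__
#endif

const char *whoami (void)
{
    /* expands to the enclosing function's name, e.g. "whoami" */
    return OPENPGM_FUNCTION;
}
```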
[09:05] mikko
[09:05] Steve-o ok i see the problem
[09:06] Steve-o I assume sun = sparc
[09:06] Steve-o which is usually pretty good considering the other options are awful
[09:06] Steve-o sunpro on linux doesn't define sun
[09:07] Steve-o so what does "sun" mean?
[09:07] Steve-o it's not there for opensolaris
[09:08] mikko sun and __sun are defined for x86 solaris
[09:08] mikko sparc and __sparc for SPARC
[09:08] mikko as far as i know
[09:09] mikko i dont have access to sparc
[09:09] Steve-o sparc is for ultrasparc and older
[09:09] mikko
[09:09] mikko this says that it's for solaris operating system
[09:10] Steve-o it's __sparcv9 for the box I have
[09:10] Steve-o which probably makes __sparc a v8
[09:12] Steve-o ok, have to guess which ones match gcc
[09:12] Steve-o I don't even use them which makes it worse
[09:13] Steve-o they're slower on the hardware I have
[09:13] mikko interesting
[09:13] mikko CC man page says that 'sun' is sparc only macro
[09:14] mikko but that might be from time before solaris studio on x86
[09:14] mikko not sure
[09:14] Steve-o x86 solaris is really old, but yes there is a lot of confusion
[09:14] Steve-o looks like prefetch -> sun_prefetch_read_many
[09:14] Steve-o prefetchw -> sun_prefetch_write_many
[09:15] Steve-o I'll have to switch on my sparc and check
[09:15] rrg Steve-o: the cost of adding a socket is insignificant?
[09:15] Steve-o I think I have a few fixes to back port to 5.0 too anyway
[09:16] mikko Steve-o: i use +w -Xc to compile with sun studio
[09:16] mikko if you want to test
[09:16] Steve-o rrg: 0mq automagically manages the multiplexing, so once one underlying transport is set up it is cheap
[09:18] rrg Steve-o: by setup you mean the context initialization?
[09:18] Steve-o rrg: the IO thread(s) open up the TCP/IPC/PGM sockets in the background
[09:19] rrg ah, ok. so context init = transport init
[09:20] rrg Steve-o: but won't binding the socket work exactly once? in case of tcp/ip unicast?
[09:21] Steve-o i'm going on what the whitepapers on the architecture say :-)
[09:24] rrg Steve-o: ? -> /me ?
[09:27] Steve-o brb, a couple of things to work on
[09:29] guido_g rrg: the context has nothing to do w/ a particular transport
[09:34] guido_g icp?
[09:35] rrg ipc
[09:35] rrg guido_g: seems to work!
[09:37] rrg guido_g: ok, i have single thread now which binds a pub socket to tcp and sits in a recv loop on an ipc pull socket.
[09:38] rrg guido_g: plus a certain amount of threads which send multipart messages to thread-specific ipc push sockets which they close when they terminate.
[09:38] rrg guido_g: seems to work fine with 2 context i/o threads. any antipattern i'm using here?
[09:39] guido_g 2 contexts? seems fishy
[09:39] Steve-o mikko: aaagh, that strict mode is annoying
[09:41] Steve-o I'm not even getting "sun" defined
[09:41] toni hey there. I have a XREQ client connected to a set of XREP servers. When one server dies, the client still tries to send messages to it. Is this the default zmq behavior, or am I doing something wrong?
[09:42] Steve-o mikko: so strict mode presumably means only __sun is defined? always the underscore requirement?
[09:43] rrg guido_g: 2 i/o threads, 1 ctx
[09:44] guido_g rrg: i dont think you need 2 io threads, it's a waste of resources
[09:46] rrg guido_g: is there any use case for n>1 threads?
[09:47] guido_g rrg: no, not that i know of
[09:48] rrg guido_g: ok
[09:49] rrg guido_g: ty
[09:52] toni I described my issue in detail here:
[09:59] Steve-o toni_: send onto the list too
[09:59] guido_g toni_: but first write a working test, please
[09:59] toni Steve-o: I did this. The working test is also linked
[09:59] guido_g toni_: ahh... and remove the useless recursion
[10:00] guido_g toni_: there're three code snippets that are obviously part of something else
[10:00] toni guido_g: the snippets I linked ?
[10:01] guido_g toni_: what else?
[10:01] toni guido_g: no thats exactly my code and my problem
[10:02] guido_g toni_: why didn't you post the code as it is then? what you posted appears to be 3 unrelated snippets
[10:02] toni guido_g: My approach is that when a server does not answer for two seconds, I resend the message
[10:03] toni guido_g: But the XREPServer and the XREQConnect are both used in the Testcase
[10:04] Steve-o mikko: I have trunk working with sun one in strict mode now, will have to try ICC later
[10:06] toni guido_g: in the Testcase I first start one server, then the second, then the third. Whenever one server is not available, my XREQ send method runs into the timeout as it does not get any reply for 2 seconds
[10:07] toni guido_g: and in this case I resend the message
[10:07] guido_g toni_: i know, i can read
[10:09] toni guido_g: What I would need, is to tell the XREQ at XREQConnect to disconnect from the addresses that are obviously broken
[10:10] guido_g toni_: ømq should do that, if not, show a comprehensible working test case and file an issue
[10:13] toni guido_g: Okay, I'll open up an issue. Seems to me as if it's not working properly
[10:14] guido_g toni_: make your test actually working and as small as possible, otherwise it'll be difficult to prove your point
[10:15] toni guido_g: yes I will. Thanks for your hints anyway.
[10:49] toni One question remains. In zmq I can connect a socket to some addresses (that never were bound before), send messages to it and they will be buffered until the addresses become available. So consider one client and two servers. The client connects to both. The messages are load-balanced between the servers. When one server dies (s.close()) the client will try to send its message to that address. As it is not available they are buffered. A
[11:45] sustrik drbobbeaty: hi
[11:45] sustrik would you do the -1 default, or should i?
[11:47] drbobbeaty sustrik: I'll be glad to. I just need to get a few things done this morning and then hammer it out.
[11:47] sustrik great
[11:47] drbobbeaty I want to build RPMs, etc. to verify etc. So it'll take me just a little bit.
[11:47] sustrik sure, no haste
[11:52] toni I posted my problem here with a minimal testcase:
[11:57] guido_g toni_: does it work if you start both servers and then kill one of them?
[11:59] guido_g toni_: of course, after the client connected to them
[12:02] toni guido_g: no, when I start both servers, then start the client e.g. sending 100 000 msgs and then kill one server, the client still tries to send to the dead one
[12:03] guido_g ouch
[12:04] guido_g sounds really like a bug
[12:04] toni The testcase is quite simple, I hope everyone can reproduce this behavior
[12:10] guido_g yep, killing one of the servers causes a delay every other request
[12:13] toni guido_g: but actually, zmq should behave differently and notice the dead server?
[12:13] guido_g should be in the docs
[12:15] guido_g ahhh.. got it
[12:16] guido_g set the HWM for the sending socket to 1, then it works
[12:19] toni guido_g: You are my hero !!!
[12:19] toni guido_g: hanging on this issue for almost 2 days now
[12:19] toni guido_g: Thanks a lot for your help!!!
[12:19] guido_g the reason is -- i think -- the unbounded queue for this endpoint
[12:20] toni guido_g: what do you mean by the unbounded queue?
[12:20] toni guido_g: As far as I understood the HWM limits the send-buffer of the socket to the size of 1
[12:20] guido_g every endpoint under a ømq socket has a queue for messages to be sent to it
[12:21] toni ah okay, thats what I just called buffer...
[12:21] guido_g not of the ømq socket, but the underlying endpoint
[12:23] toni guido_g: thanks for your help. Great project and great community
[13:38] bobdole369 Hello good morning, is zeromq used often with embedded devices? Specifically Rabbit Semiconductor stuff, or Schneider (or others) PLCs?
[15:17] mikko bobdole369: it might make sense to ask that question on the mailing lists to reach a bigger audience
[15:21] Steve-o mikko: what strict parameters are you using for ICC? I'll check tomorrow and update trunk for PGM
[15:31] mikko Steve-o: sec
[15:31] mikko -strict-ansi
[15:43] Steve-o taa, will test with 5.1 and backport to 5.0
[15:43] mikko cool
[15:44] Steve-o only OSX needs 5.1 because of the lack of spin locks
[15:44] Steve-o maybe something with Win x64 too but I can't recall
[16:25] mikko sustrik: there?
[17:04] sustrik mikko: hi
[17:05] mikko sustrik: im having issues with ZMQ_QUEUE
[17:05] sustrik yes?
[17:05] mikko it seems that my messages are not being closed
[17:05] mikko it's a bit tricky to reproduce
[17:05] mikko but i got front=pull, back=push device
[17:06] mikko and if i am not consuming the message the memory usage goes up steadily
[17:06] mikko if i then start consumer the memory usage doesn't go down
[17:06] mikko even though messages are flowing
[17:06] mikko i can see using valgrind that memory is held in the messages
[17:06] sustrik are you publishing at full spead?
[17:06] sustrik speed
[17:07] mikko yes
[17:07] mikko is it expected that memory usage never goes down?
[17:08] sustrik well, if receiver is slower than publisher it can happen
[17:08] mikko but even if i stop publisher and start consuming
[17:09] mikko it seems that usage does not go down
[17:12] sustrik that looks like a leak
[17:12] sustrik can you report the problem along with the test program?
[17:13] mikko does this make sense:
[17:13] mikko i generate 10k messages, then turn consumer on/off/on/off
[17:13] mikko until no more messages flow through
[17:13] mikko kinda simulating flaky network
[17:13] mikko and at the end it seems that not all messages get freed
[17:13] mikko but there is no more coming from pipe
[17:14] sustrik how do you know they weren't freed?
[17:15] mikko valgrind shows them in "still reachable" memory
[17:18] mikko it might be something in my app as well
[17:18] mikko i will investigate further and open a ticket if it's somewhat reproducible
[17:36] sustrik mikko: how did you shut down the device?
[17:37] mikko sustrik: ctrl + c
[17:37] sustrik let me check the code...
[17:37] sustrik master?
[17:37] mikko yes
[17:37] mikko it could very well be my test code
[17:38] mikko i need a statistics collection over http
[17:38] mikko so im writing a small webserver that distributes the messages to backend nodes for processing
[17:43] sustrik mikko: aha
[17:43] sustrik the devices are not shutting down decently
[17:43] sustrik if (errno == ETERM)
[17:43] sustrik return -1;
[17:43] sustrik what they should do, i guess
[17:44] sustrik is to set LINGER to 0
[17:44] sustrik and terminate via zmq_term()
[18:51] delaney are there any examples of the proper use of socket.RecvAll() for C#?
[18:54] sustrik delaney: i don't think there are many examples for clrzmq2, try asking on the mailing list
[18:54] delaney k
[20:26] sustrik mikko: hm, i was wrong
[20:27] sustrik Ctrl+C actually kills the program, thus not freeing the allocated memory
[20:27] sustrik so valgrind would naturally report leaks
[20:27] sustrik alternative would be to install SIGINT handler and try to clean-up before exiting
[20:45] CIA-20 zeromq2: 03Bob Beaty 07master * rfcfad56 10/ (6 files in 3 dirs): (log message trimmed)
[20:45] CIA-20 zeromq2: Added Recovery Interval in Milliseconds
[20:45] CIA-20 zeromq2: For very high-speed message systems, the memory used for recovery can get to
[20:45] CIA-20 zeromq2: be very large. The current limitation on that reduction is the ZMQ_RECOVERY_IVL
[20:45] CIA-20 zeromq2: of 1 sec. I added in an additional option ZMQ_RECOVERY_IVL_MSEC, which is the
[20:45] CIA-20 zeromq2: Recovery Interval in milliseconds. If used, this will override the previous
[20:45] CIA-20 zeromq2: one, and allow you to set a sub-second recovery interval. If not set, the
[20:45] CIA-20 zeromq2: 03Martin Sustrik 07master * ra9d969a 10/ AUTHORS :
[20:45] CIA-20 zeromq2: Bob Beaty added to the AUTHORS file
[20:45] CIA-20 zeromq2: Signed-off-by: Martin Sustrik <> -
[21:41] mikko sustrik_: the thing i was seeing is that the memory just 'sits' there
[21:41] mikko and doesn't go down during when the program runs
[21:46] sustrik that's the OS problem AFAIK
[21:46] sustrik OS doesn't return deallocated memory back to the shared pool
[21:46] sustrik rather the memory is kept by the running process
[21:47] sustrik Solaris claims that it's able to reuse the memory, but my tests haven't confirmed it
[21:49] mikko i shouldn't see that memory in still reachable in valgrind though
[21:49] sustrik right, that's interesting
[21:49] mikko but it might've been my application, as i don't seem to be able to reproduce it after a couple of hours of refactoring
[21:50] sustrik yes, this kind of problem is pretty hard to track down
[21:52] sustrik if you are able to reproduce in the future we can give it a try though
[21:54] mikko cool
[21:54] mikko the tool feels pretty stable now
[21:54] mikko can push a bit over 3k http requests per second on my virtual machine which seems more than enough
[22:02] sustrik :)