ZeroMQ IRC Log

Tuesday June 7, 2011

[Time] Name	Message
[01:15] Seta00	mikko, have you seen this? http://www.cs.wisc.edu/graphics/Gallery/VirtualVideography/
[01:15] Seta00	I think that would couple extremely well with your Lecture Recorder
[01:39] yukonbob	hello, #zeromq
[02:35] Seta00	hello yukonbob
[02:49] yukonbob	:)
[02:49] yukonbob	doesn't seem too active in here in my visits over last few days... people here mostly knowledgeable and trouble-free?
[03:01] michelp	yukonbob, the activity comes in bursts
[03:02] michelp	stick around
[03:02] yukonbob	am just getting started w/ 0mq -- never got into it all this time, but last couple days digging in... very fun.
[03:03] michelp	yeah same here, i've been working with it for a couple weeks. I'll never write another server again, i'll just use 0mq ;)
[03:05] michelp	have you read the guide?
[03:05] yukonbob	michelp: have not, should.
[03:05] michelp	there are no public services that i know about, you can easily create your own service and connect to it
[03:05] yukonbob	I've just been browsing the API and examples.
[03:05] michelp	the guide has numerous examples
[03:05] michelp	the man pages are also very good
[03:05] yukonbob	michelp: I'd like a third party service, so I'm not working my own potential mistakes...
[03:06] michelp	well i've never heard of one. i don't think you'll have much option but to run your own services
[03:06] michelp	or run an existing framework like mongrel
[03:06] michelp	and connect to it with your code
[03:07] michelp	http://mongrel2.org/home
[03:07] yukonbob	0mq deemed suitable for transporting over internet, or is dumb idea?
[03:07] michelp	there's a wiki page on that subject
[03:08] michelp	don't remember exactly where it is
[03:08] michelp	0mq is not more secure than tcp, you have to build in your own security
[03:08] michelp	how hard you work at that is up to you
[03:08] michelp	sure
[03:08] michelp	i have a little time
[04:06] yukonbob	hehe: We'll generate random values, just like the real weather stations do.
[04:06] Toba	weather stations = number stations?
[04:07] michelp	the guide is good :)
[05:42] CIA-31	pyzmq: 03MinRK 07master * r3435e19 10/ zmq/eventloop/ioloop.py : Don't exit loop on EINTR ...
[06:17] CIA-31	pyzmq: 03MinRK 07master * rcc52204 10/ (15 files in 3 dirs): rename czmq.pxd to libzmq.pxd, to avoid confusion with czmq project (high-level C API) - http://bit.ly/kyRads
[08:00] pieter_hintjens	hi folks
[08:01] aurojit	pieter_hintjens: hi
[08:01] eintr	top of the morning, gentlemen
[08:01] pieterh	hi eintr, aurojit
[08:31] eintr	this would make for helluva titanic: http://code.google.com/p/leveldb/ ... makes me want to build something weird and awesome.
[08:37] pieterh	eintr: looks like it'd be a great fit with 0MQ
[08:39] eintr	pieterh: yeah, i think so. although overkill if all you need is logging / durability. IF you add something that would actually work on logged messages however, then it should be nifty.
[08:40] eintr	"which of N workers processed message M within timeframe T" style questions come to mind.
[08:40] pieterh	eintr: mixed with something like Titanic, it could be neat
[08:40] eintr	mm, distributed brokerage :)
[08:47] mikko	Seta00: haven't
[08:47] mikko	Seta00: seems to be inactive
[09:59] masterzen	Hi. I'm watching Zed Shaw presentation right now, and the first thing he said was: "don't put 0mq on the Internet". Is this still true? Can't we use 0mq to build remote clients/server stuff where we don't control where clients run?
[10:05] mikko	let's say that it's getting better
[10:10] masterzen	mikko: thanks, in what version does it get better (ie 2.1 or in the 3.0 branch)?
[10:10] mikko	2.1 is the current stable branch
[10:10] mikko	3.0 is going through a lot of changes
[10:16] pieterh	masterzen: there are some places 0MQ will die if fed bad data, but that's not the real issue with Internet use
[10:17] pieterh	the more profound issue is one of security, IMO
[10:21] mikko	it's currently relatively easy to OOM zeromq program
[10:21] mikko	if you know what you are doing
[11:00] masterzen	pieterh: ok thanks, can you elaborate about the security?
[12:09] sunoano	no built-in transport layer encryption ie you have to take care of that yourself eg using VPN
[12:25] private_meta	ok, so compiling the same program on redhat ruins my messages, compiling/running it on ubuntu works
[12:25] private_meta	I'm looking and looking and can't figure out what actually garbles my messages
[12:26] private_meta	It's like my program rolls a 1d20 and everything but a natural 20 breaks the program
[12:28] mikko	have you tried compiling without optimisations?
[12:28] private_meta	sure
[12:28] mikko	which version of g++ ?
[12:28] private_meta	works on 4.4 on ubuntu, doesn't work on 4.1.2 and 4.4 on redhat
[12:29] private_meta	but i'll try again with different optimizations
[12:29] mikko	valgrind telling anything?
[12:31] private_meta	valgrind doesn't really help there
[12:34] private_meta	ah ok
[12:34] private_meta	fixed something with compiling with debug mode on the server
[12:35] private_meta	so now, SOME of the messages come through
[12:35] private_meta	at least the same amount as with ubuntu
[12:36] private_meta	I assume it still garbles the >128Byte Messages as it does with Ubuntu though... can't test it right now
[12:36] private_meta	Any idea why the messages would be garbled with debugging disabled or optimization enabled?
[12:41] jsimmons	because you're ruining the memory backing them and changing the compile modes changes how it's laid out in memory
[12:42] jsimmons	so... valgrind!
[12:45] pieterh	masterzen: sorry, went to lunch
[12:45] pieterh	you still around?
[12:45] masterzen	pieterh: yes
[12:45] pieterh	well, any unsecured TCP activity over Internet is problematic
[12:45] pieterh	it's ok for public data, read only
[12:45] masterzen	pieterh: I understand
[12:45] pieterh	anything else, not so much
[12:46] masterzen	pieterh: it's still possible to encapsulate the data in my own encryption/security layer...
[12:46] pieterh	masterzen: yes, indeed
[12:46] pieterh	e.g. what Salt does over 0MQ
[12:47] private_meta	JStoker: would be easier if I knew how to use valgrind that way... Tried looking for tutorials that actually helped and didn't find anything
[12:49] private_meta	erm
[12:59] private_meta	k, i meant jsimmons
[13:15] Seta00	masterzen, I'm currently using 0MQ on the internet, for audio streaming, with full end-to-end encryption
[13:27] masterzen	Seta00: it's interesting. Did you encounter any issues?
[13:29] Seta00	masterzen, except for the gigantic lack of compatibility between different libraries, no :P
[13:29] Seta00	in other words, 0MQ gave me the less amount of issues
[13:30] Seta00	gave me less issues*
[13:30] so_solid_moo	fewer issues ;)
[13:30] Seta00	yeah!
[13:30] masterzen	Seta00: ok, thanks. I'll play with 0mq to give it its chance :)
[14:15] private_meta	Ahahahaha... Running my program with valgrind makes it work, running it WITHOUT valgrind makes it break
[14:15] private_meta	...
[14:23] private_meta	This is beyond frustrating
[15:13] pieterh	private_meta: is it a multithreaded program?
[15:14] private_meta	yes
[15:16] private_meta	pieterh: the receiving is handled by a single thread, the sending can be done by any thread I think
[15:16] pieterh	I'd suspect timing issues
[15:16] pieterh	it's the only thing I know valgrind affects
[15:17] private_meta	pieterh: just added and testing one more thing. If I actually DUMP the message somewhere (to string, file) before sending, it appears to work in this case
[15:18] pieterh	sure, same thing
[15:18] pieterh	it slows down one of the threads
[15:18] pieterh	the way to fix this is to ignore the case where it works
[15:18] pieterh	don't try to figure out why it works, that's unhelpful
[15:18] private_meta	sure, but working cases are important to narrow down what's making it fail
[15:19] pieterh	well, not really ime
[15:19] pieterh	this is a classic problem solving approach
[15:19] pieterh	ignore the "it works" cases and especially trying to compare the two
[15:20] pieterh	what you have is a timing dependency between connects, binds, or messages
[15:20] pieterh	under valgrind, or if you add debug output, the dependency shows, and causes your app to break
[15:21] pieterh	so keep that breakage happening, and slice up the code until you find the culprit
[15:23] private_meta	I have no idea how to do those last two lines you told me
[15:24] pieterh	ok... without knowing anything about your application...
[15:24] private_meta	As -) I have no idea what makes it break, -) I can't really 'slice up the code', -) I don't have half a clue about valgrind and as I stated before, there aren't any tutorials that are helpful for me, -) not quite sure what you mean with the dependencies
[15:24] pieterh	you want to see your code misbehaving
[15:25] pieterh	whatever you mean by "it breaks"
[15:25] private_meta	yeah, call it misbehaving, works for me
[15:25] pieterh	that means not running under valgrind, not doing debug output (sorry, i was getting this the wrong way around)
[15:26] pieterh	now you have it breaking, you remove pieces of the code
[15:26] pieterh	that's the simplest approach, chop pieces off it
[15:26] pieterh	at some point you'll see it suddenly work
[15:27] pieterh	you may be able to figure it out simply by studying the code
[15:27] private_meta	That's my problem right there
[15:27] pieterh	once you realize that the fault is caused by something happening faster than you expect, somewhere
[15:27] private_meta	from the situation I have right now
[15:28] private_meta	Removing code means throwing out 90% of the code required right now, and there is no intermediary version between this and the simple "let's send a message" case
[15:29] pieterh	ok, then, you might try to narrow down which message flow is "going too fast"
[15:29] private_meta	at least none that really works that helps me
[15:29] pieterh	by inserting pauses at different places
[15:32] private_meta	Before plucking it apart, there's something that bugs me. The program flow is set up so the entire connection has to be open for the program to proceed and send the message that fails. I even have to add a "sleep" after initiating the connection or it fails instantly. I just can't quite grasp there WOULD be such a timing issue
[15:34] private_meta	I mean, it's the first message sent that fails. there's a second delay from connection initiation to first message send, which makes me assume it's between the function call where I say "send this string" and the actual creation of a zmsg and its sending, right now I just can't see why a timing issue can be so relevant there, as in the initial case, the entire thing is pretty "linear", even if it's in a separate thread
[15:34] private_meta	well... I'll try narrowing it down, let's see if i can at least do that any further
[15:38] pieterh	you know that a connect takes ~100msec to happen, in the background, right?
[15:41] private_meta	pieterh: color me ignorant there for not knowing, but I thought messages sent if the connection is not established are, i don't know, kept back until it's open, and sent then
[15:41] pieterh	it depends on the socket type
[15:42] pieterh	also if you're connecting or binding
[15:42] pieterh	what is the socket type?
[15:42] pieterh	and who is doing the bind vs. connect?
[15:44] mikko	i am
[15:44] mikko	j/k
[15:44] mikko	pieterh: how are you?
[15:44] pieterh	mikko: hey, nice to see you around, I'm fine
[15:44] pieterh	just got back from two weeks in the rainy west coast US
[15:45] mikko	it wasn't rainy when i was there
[15:45] mikko	well, it was rainy at SFO
[15:45] pieterh	yeah, precisely :)
[15:45] mikko	san diego was sunny
[15:45] pieterh	well, that's semi-desert
[15:46] pieterh	we had a good meetup in portland, and a good one in SFO
[15:50] private_meta	pieterh: bind is the one that receives, it receives ok, the one that does the connect is the one that fails (it has 1 sec delay after doing the connect)
[15:51] pieterh	private_meta: what socket types are you using at bind and at connect?
[15:51] pieterh	and how do you define 'fails'?
[15:53] private_meta	well... that's one thing I forgot to mention. it fails this way: it creates the multipart message pretty much correctly, but the "payload", the part where I actually have the content in, instead of the string it looks like a pointer address, like "7593151fa9c" (random), but as I said, only in those cases.
[15:53] private_meta	about the types
[15:53] pieterh	ah...
[15:53] pieterh	hang on
[15:53] pieterh	:-) next time, it'
[15:53] private_meta	AND
[15:53] pieterh	it'd be worth explaining what you mean by 'fail'
[15:53] private_meta	in addition to that
[15:54] private_meta	in case this is also important:
[15:54] private_meta	the message to be sent is already corrupted like this BEFORE it is sent
[15:54] private_meta	when it is created
[15:54] private_meta	but as I said, only in the cases explained
[15:54] pieterh	private_meta: hang on, we have enough data
[15:54] private_meta	ok
[15:54] private_meta	I really should have explained that sooner apparently
[15:55] pieterh	your sender is presumably destroying the message before its actually sent
[15:55] pieterh	what language are you working in?
[15:55] private_meta	C++
[15:55] pieterh	ok, I've no idea what destructor or deallocator you're using, but...
[15:55] pieterh	this is a classic error
[15:56] pieterh	zmq_send() happens in the background
[15:56] pieterh	meanwhile your main thread destroys or frees the memory the message is in
[15:56] pieterh	result: 0MQ sends garbage
[15:56] pieterh	you can send $0.50 via paypal to the usual address
[15:58] private_meta	Following that logic, may I make an assumption?
[15:58] pieterh	of course
[15:58] private_meta	The larger the message to send is, the longer it takes to prepare/send, so the larger the chance for it to be erased and for the send to "fail".
[15:58] pieterh	nope
[15:59] pieterh	well, maybe
[15:59] pieterh	sending is 1 system call, independent of message size
[15:59] pieterh	if you send many small messages they may be batched into 1 system call
[15:59] private_meta	The thing is, your option does not explain away the other behavior, that after a length of 128 characters, it fails CONSISTENTLY, no matter what I do
[15:59] pieterh	sure
[15:59] private_meta	at least according to what I found out
[15:59] pieterh	fix the deallocation issue
[16:00] private_meta	hmm
[16:01] private_meta	I already DO a memcpy into a char pointer before giving the string to zmsg, so there shouldn't be any deallocation from anything but zmsg
[16:02] pieterh	well, can you post a fragment of the sending code?
[16:02] pieterh	you should be able to make a minimal test case, if it's really failing at send time
[16:03] private_meta	"minimal" would still contain 10-15 files
[16:03] private_meta	any kind of minimal that still resembles anything that I'm actually using
[16:10] private_meta	pieterh: the kind of non-working minimal is this https://gist.github.com/c91e37df41d3bb8ccbfe part 1 is the part of the code that creates the message and sends it, the rest should be in a way explanatory, but as said, it might be a bit too minimal
[16:11] pieterh	so here's the problem
[16:11] pieterh	char * msg = new char[size]; memcpy(msg, message.data_nb(), size);
[16:11] pieterh	I mean, seriously...?
[16:11] pieterh	price just went up to $1.50
[16:12] pieterh	when is 'msg' destroyed? it's local to the function, no?
[16:13] private_meta	ah... good you brought that up, memory leak, msg is not destroyed
[16:14] private_meta	as the message can't really destroy itself after sending itself, i clear it out after it's sent, but it's not destroyed
[16:14] mikko	private_meta: are you sending just ascii strings?
[16:14] private_meta	mikko: pretty much, yeah
[16:15] mikko	private_meta: line 5 would mess up if you had null characters in the message.data ()
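Note on mikko's point: the gist itself is not reproduced in this log, so the following is only a sketch under the assumption that the line in question measures or copies the payload with a C-string function. Such functions stop at the first NUL byte, while a copy that carries the byte count explicitly keeps the whole payload. The function names are hypothetical and only illustrate the difference.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Length as a C-string function sees it: counting stops at the first NUL.
std::size_t c_string_length(const std::string &payload) {
    return std::strlen(payload.c_str());
}

// Copy that trusts the C-string length: silently truncates binary data.
std::string copy_as_c_string(const std::string &payload) {
    return std::string(payload.c_str());
}

// Copy that carries the byte count explicitly: embedded NULs survive.
std::string copy_with_size(const std::string &payload) {
    return std::string(payload.data(), payload.size());
}
```

For pure ASCII text without NUL bytes, as private_meta describes, both copies give the same result; the difference only shows once binary data is sent.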
[16:15] pieterh	ok, so new is persistent... sorry, false alarm
[16:15] mikko	new allocates from heap
[16:16] private_meta	well, the zmsg, in this buggy case, lives until the program dies
[16:16] pieterh	private_meta: and how about 'zmq::message_t message;' in send()?
[16:16] private_meta	sec
[16:16] pieterh	that's not persistent
[16:16] pieterh	line 46
[16:16] private_meta	no, it's not
[16:16] pieterh	you're sending from stack
[16:16] pieterh	as soon as the function exits, the stack unrolls
[16:16] pieterh	you get... hmm, stuff that looks like pointers on it
[16:17] pieterh	do not send messages from stack variables
[16:18] private_meta	Ok, so I new the message_t variable, but I'd still have to delete it, and to be consistent I'd still have to do it at the end of each loop iteration. What does it make different there? It still exists as long as the other
[16:19] pieterh	shrug, the correct way is to allow 0MQ to deallocate the buffer
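Note on "allow 0MQ to deallocate the buffer": libzmq's zmq_msg_init_data() accepts a deallocation callback for this purpose. The sketch below does not use libzmq; it is a small self-contained model of the same ownership handoff, assuming a sender that transmits later than the call to send(). OutboundQueue, send_copy and release_buffer are hypothetical names invented for the illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <deque>
#include <string>

// Hypothetical stand-in for an asynchronous sender; NOT the libzmq API.
typedef void (*free_fn)(void *data, void *hint);

struct Pending {
    void *data;
    std::size_t size;
    free_fn release;
    void *hint;
};

class OutboundQueue {
public:
    // Takes ownership of `data`: the caller must not free or reuse it.
    void send(void *data, std::size_t size, free_fn release, void *hint) {
        Pending p = {data, size, release, hint};
        pending_.push_back(p);
    }

    // Simulates the background I/O thread running some time later.
    std::string flush() {
        std::string wire;
        while (!pending_.empty()) {
            Pending p = pending_.front();
            pending_.pop_front();
            wire.append(static_cast<const char *>(p.data), p.size);
            if (p.release) p.release(p.data, p.hint);  // freed only now
        }
        return wire;
    }

private:
    std::deque<Pending> pending_;
};

// Release callback: frees the buffer and counts the call through `hint`.
void release_buffer(void *data, void *hint) {
    std::free(data);
    if (hint) ++*static_cast<int *>(hint);
}

// Copies `text` to the heap and hands the copy over to the queue.
// Returns false if the allocation fails.
bool send_copy(OutboundQueue &q, const std::string &text, int *freed) {
    void *buf = std::malloc(text.size() + 1);  // +1 avoids malloc(0)
    if (!buf) return false;
    std::memcpy(buf, text.data(), text.size());
    q.send(buf, text.size(), release_buffer, freed);
    return true;
}
```

The point of the design is that the sending function never frees the buffer itself, so it does not matter how late the actual transmission happens.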
[16:20] pieterh	do you see that sending message_t is probably the cause of (some of your) problems?
[16:20] mikko	pieterh: that construct should be fine
[16:20] mikko	it's similar to creating zmq_msg_t in stack
[16:20] mikko	and sending it
[16:21] pieterh	mikko: it's a 0MQ message type? and properly initialized?
[16:21] pieterh	hmm, ok, false alarm #2
[16:21] pieterh	private_meta: I'm still going to go with "sending stuff that isn't there anymore by the time send() gets around to it"
[16:21] mikko	pieterh: when the object goes out of scope zmq_msg_close is called
[16:22] mikko	pieterh: but if the message is still in-flight the refcount should be >0
[16:22] pieterh	private_meta: there's a magic cut-off at 50 bytes, smaller than that sits in the structure, larger is an allocation
[16:23] pieterh	mikko: I'm not familiar with the C++ API, tbh, can only guess based on the symptoms
[16:24] private_meta	mikko: I built it assuming that behavior, but doing that, how could I avoid behavior described by pieterh, should that be the case?
[16:24] private_meta	pieterh: well, at this moment, the payload (the part of the multipart message that contains the information, has 78 bytes
[16:25] pieterh	private_meta: hmm, can you try to chop this down to a test case, one sender and one receiver?
[16:25] pieterh	sorry, I know that's a detour but it may be the only way to solve this
[16:25] private_meta	Define one sender, one receiver... it practically IS like that right now
[16:25] private_meta	I'm only testing with one sender/one receiver right now
[16:29] private_meta	I've got somewhat of an onion here, basic send, zmsg around that, modified majordomo around that, project-bound client-server classes around that
[16:29] private_meta	pieterh: so I'm not sure to what level you intend me trying to strip it down
[16:30] pieterh	private_meta: stripped down means any lines of code that can go must go
[16:31] pieterh	otherwise other people can't reasonably understand the case
[16:31] pieterh	so you'd at least see if problem was in project classes, MDP classes, or zmsg class
[16:32] private_meta	I'll try to strip it down to mdp only
[16:33] pieterh	ack
[16:51] private_meta	ok, it's more difficult than I thought
[16:52] private_meta	pieterh: I already have to use threads to use the modified majordomo system as is
[16:52] pieterh	ok, so instead of stripping it down...
[16:53] pieterh	you have a specific case where one sender's messages are corrupted
[16:53] private_meta	well, I'll still try to make the majordomo only version work, with a single thread, I'll try
[16:54] private_meta	ugh
[16:54] private_meta	even more difficult
[16:57] private_meta	I can't strip it apart and send/receive a message normally at the same time
[17:03] michelp	morning
[17:05] private_meta	michelp: west coast?
[17:09] michelp	private_meta, yeap
[17:10] private_meta	gotta finally remember it, +9, east coast
[17:10] private_meta	michelp: ah, so you do have Valve Time
[17:10] michelp	valve time?
[17:11] private_meta	Well, Valve Corporation, the gaming company, it's their official time zone, their new day for cheap games always starts at ~10.00 east coast time
[17:12] michelp	ah i see
[17:12] private_meta	sorry for destroying your mental image
[17:12] michelp	nah no problem it reminds me i gotta replace my fuel injection pump :)
[17:12] private_meta	ah, cars are too expensive for me
[17:13] private_meta	Well, not in a "Can't afford" way, more in a "don't want to afford" way
[17:13] michelp	that's why i only drive clunkers
[17:14] michelp	currently rockin' an 84 diesel mazda pickup truck. cost me $1200 :)
[17:14] private_meta	insurance and tax for that kind of car would be quite expensive here
[17:14] michelp	where is there?
[17:14] private_meta	austria
[17:14] michelp	ah nice
[17:14] michelp	i've been to vienna, it was pretty awesome
[17:15] michelp	did a conference there at a huge hall where apparently mozart built the sets for the magic flute
[17:15] michelp	iirc it was adjacent to the opera house or something like that
[17:16] private_meta	tbh I haven't ingested much of the culture of vienna the couple of times I've been there
[17:16] michelp	i did spend a few days in the austrian countryside too, it's amazing what a difference it is from city to country out there
[17:17] michelp	i guess that's the case everywhere
[17:17] private_meta	good or bad?
[17:17] private_meta	or just different?
[17:17] michelp	just different, i like the countryside, but the city obviously had more fun stuff going on
[17:17] private_meta	sure has...
[17:17] michelp	travelling on the train through the rolling hills was nice
[17:18] private_meta	but I guess one of the differences of the countryside in austria compared to the US is that in Austrias countryside, you still have mobile reception ;)
[17:18] michelp	yeah ;) it's getting better here, although my state is still only about %50 covered
[17:18] michelp	i live in a big, less populous state so there's no incentive for them to set up towers in the vast wilderness that is most of Oregon
[17:20] private_meta	It's kinda hard to swallow that almost any US state is larger in size than austria
[17:20] private_meta	-any+every
[17:20] michelp	well, the west coast ones at least
[17:20] michelp	a lot of the eastern states are pretty small
[17:21] private_meta	Everything that's larger than Maine is larger than Austria
[17:21] private_meta	In other words, Austria is about 4k square kilometers larger than Maine
[17:22] michelp	hmm wikipedia indicates the other way around
[17:22] michelp	Maine is a little bigger, maybe i'm reading this wrong
[17:22] private_meta	Maine 79,931 m², Austria 83,855 m²
[17:22] michelp	maine: 91,646 km2 austria: 83,855 km2
[17:22] private_meta	ah... + water is 91k
[17:23] michelp	oh ok i get it
[17:23] michelp	i didn't realize they made a distinction with the water :)
[17:23] private_meta	Now we both do, apparently
[17:24] michelp	well Oregon is well over twice that at 255K, and only 3.5M population, the vast majority of that in the one big city Portland
[17:24] michelp	so the density for most of the state is extremely low
[17:25] private_meta	more space for wolves and bears!
[17:25] michelp	Malheur County for example is 25,610 km2 with only 31K people
[17:26] michelp	never seen a wolf, yet. plenty of bears though.
[17:26] michelp	the wolves are starting to repopulate, so it's only a matter of time
[17:26] private_meta	Everything that manages to scare Stephen Colbert must be awesome
[17:27] private_meta	It's funny how some areas are named all over the world... Malheur County makes it sound like it was an accident
[17:27] michelp	yeah, doesn't it mean Bad Odor? or something like that?
[17:27] private_meta	Well
[17:27] michelp	Bad air?
[17:27] michelp	Bad time?
[17:28] private_meta	In french and german, "malheur" means "mishap"
[17:28] michelp	ah
[17:28] private_meta	Well, then again, Austria has a town called "Fucking"
[17:28] michelp	well it's something like 90% high altitude desert so I can see how the first people who got there figured they took a wrong turn
[17:29] private_meta	Somewhat hilarious, how many times the sign for that town has been stolen already
[17:30] michelp	yeah, funny story about Fucking Austria, our management wanted us to translate every city name in our database. i told them it didn't work that way, only a few cities in the world have "translations" like "Nuevo York". Most cities do not, so I used Fucking Austria as my example to convince them they were wrong. ;)
[17:30] private_meta	hahaha
[17:30] michelp	it worked perfectly. the best technical solutions are always zero effort :)
[17:31] michelp	Los Angeles is a good one too, what are we going to translate that into English? Yeah I'm flying down to "The Angels" next week...
[17:32] private_meta	Well, It's weird why some towns actually have to be translated
[17:33] private_meta	why can't you just say "Wien", why did it have to be translated to "Vienna"?
[17:33] private_meta	although I have to admit that the early name for Vienna was Vindobona, so a V is actually closer :D
[17:35] private_meta	pieterh: This is like taking a pair of dice and rolling them, with the outcome deciding if it works or not
[17:35] pieterh	you mean city names, or debugging messages?
[17:35] private_meta	well, both, in a way, but I meant the latter
[17:36] pieterh	well, timing issues are like that
[17:36] private_meta	I told you that adding a logging message made the error go away
[17:36] pieterh	sure, it delays something long enough
[17:36] private_meta	I did nothing else and then consistently the error went away
[17:37] pieterh	sure, that also happens
[17:37] private_meta	now I removed it
[17:37] private_meta	now the error is still gone
[17:37] private_meta	consisntently
[17:37] private_meta	*consistently
[17:37] pieterh	well, not consistently, obviously...
[17:37] private_meta	Well, consistently meaning "no matter how many times i start it now"
[17:37] pieterh	yes
[17:38] pieterh	the best would be then to run it over and over in a loop
[17:38] pieterh	it'll fail every N out of M times
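Note on running it in a loop: a minimal sketch of a repeat harness, assuming the scenario under test can be wrapped in a callable that returns true on success. count_failures is a hypothetical helper, not part of any library.

```cpp
// Runs `trial` `runs` times and returns how many runs reported failure.
// `trial` is any callable returning true on success, false on failure.
template <typename Trial>
int count_failures(Trial trial, int runs) {
    int failures = 0;
    for (int i = 0; i < runs; ++i) {
        if (!trial()) ++failures;
    }
    return failures;
}
```

A failure count that stays above zero over many runs is what keeps the bug visible while code is sliced away; a count of zero over a handful of runs says little about a timing issue.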
[17:38] taotetek	pieterh: hey, Ken told me he met you last week
[17:38] pieterh	taotetek: hi!
[17:38] taotetek	pieterh: he's who we have working on the rsyslog zeromq plugins
[17:38] private_meta	pieterh: Yeah, I can ship it with a failure probability
[17:38] pieterh	ah, yes, heh
[17:38] taotetek	pieterh: I'm out in California this week he and I will be spending some time working on them more this thursday, woot
[17:39] private_meta	cool... got a new error
[17:39] pieterh	great guy
[17:39] private_meta	basic_string::_S_construct NULL not valid
[17:39] taotetek	pieterh: yeah he is I love working with him
[17:40] pieterh	private_meta: you're working for Microsoft?
[17:40] private_meta	pieterh: hey, don't let things get ugly
[17:40] pieterh	then shipping with failure probability is probably not a great idea
[17:40] pieterh	:)
[17:40] pieterh	i meant, for debugging it
[17:41] private_meta	working for a small research project
[18:13] private_meta	It wouldn't be fun if there weren't a new problem whenever I try to get close to the current one...
[18:14] private_meta	Now I can't Debug the server because of "Interrupted system call" exceptions
[18:14] private_meta	pieterh: I get "Interrupted system call" exception whenever I use ctrl+z or whenever the debugger goes past "zmq::poll", any idea why?
[18:14] private_meta	or maybe a combination thereof
[18:15] pieterh	private_meta: debuggers do weird stuff with signals
[18:15] pieterh	i tend to avoid them
[18:15] private_meta	Well
[18:16] private_meta	If I try to send the Stop Signal, like to put the application into background, the same thing happens
[18:16] private_meta	terminate called after throwing an instance of 'zmq::error_t' what(): Interrupted system call
[18:17] pieterh	private_meta: I've never tried Ctrl-Z on a zmq process... sounds fun
[18:18] private_meta	the reason being to send it to the background
[18:18] private_meta	ctrl+z, followed by a "bg" immeditely
[18:18] private_meta	*immediately
[18:18] pieterh	for sure, it makes sense... let me try that on a test case...
[18:20] private_meta	I'll test some more too
[18:21] pieterh	private_meta: well, it works fine on a C example (Ctrl-Z & bg)
[18:22] private_meta	I fear it's another one of those redhat-only problems
[18:23] private_meta	I'll try ubuntu
[18:25] private_meta	pieterh: ok... redhat enterprise linux 5 shows that problem
[18:25] private_meta	using the latest zmq version
[18:26] private_meta	and gcc 4.4
[18:27] private_meta	Redhat is such a piece of shit, sorry for saying that
[18:28] pieterh	private_meta: yeah, they are, but what's that got to do with their software?
[18:28] pieterh	did you try on another Linux and it works?
[18:29] private_meta	They manage to break the simplest stuff in their libraries, their repositories are ancient
[18:31] taotetek	private_meta: redhat enterprise is simply older, as far as what versions of various packages you'll have.
[18:31] taotetek	private_meta: have you looked at 6 by the way?
[18:31] private_meta	No
[18:32] private_meta	And, in RHEL 5 they manage to break loads of standard libraries with their changes they make before putting them into their repositories
[18:32] private_meta	pieterh: It works on Ubuntu, I'd have to try other systems but I guess it works on them as well
[18:33] taotetek	don't know on that. we have centos 5 in production and haven't had issues outside of a couple of compilation issues with things that required newer autoconf / libtool etc
[18:34] private_meta	Do you know what might actually trigger that "interrupted system call" error in zmq?
[18:34] private_meta	'cause, apparently it does come from inside zmq
[18:34] pieterh	IMO it's a signal sent to zmq_poll, which doesn't particularly handle it.
[18:35] pieterh	presumably some versions of Linux send signals when you background a process
[18:35] pieterh	like SIGHUP or something
[18:35] private_meta	Well, ctrl+z DOES send a signal
[18:36] private_meta	Afaik it sends SIGTSTP
[18:36] private_meta	which means temporary stop
[18:39] pieterh	so in CZMQ I catch various signals which means zmq_poll won't return EINTR
[18:39] pieterh	actually it may still return EINTR, I forget
[18:40] private_meta	Well, still... apparently sending signals 18, 20 or 24 to the process, "SIGTSTP" creates this problem, but ONLY on redhet
[18:40] private_meta	*redhat
[18:42] pieterh	much as I'd like to blame rhat it's probably not their fault
[18:50] private_meta	pieterh: https://gist.github.com/8dbc7c8883b4d60144fa I know that is somewhat of an incomplete example, but it should show what I mean. When you press ctrl+z there, the error occurs on RHEL5
[18:52] pieterh	well, zmq_poll returning EINTR is not abnormal afaik, it happens in various situations
[18:52] private_meta	pieterh: In my opinion it shouldn't be normal behavior if it makes it terminate when trying to debug a program
[18:53] pieterh	private_meta: it only terminates because you don't handle the EINTR
[18:53] pieterh	there are a few threads on zeromq-dev about this, worth reading
[18:54] private_meta	pieterh: so who handles it for me on other OSes if I don't?
[18:54] pieterh	private_meta: different OSes (and kernels) send different signals
[18:56] private_meta	I'll check which signals are sent exactly
[19:01] private_meta	pieterh: do you know what signals zmq handles?
[19:02] pieterh	afaik it doesn't handle any signals, it's the poll system call (and other blocking calls) that exit with EINTR
[19:03] pieterh	so, for ctrl-C, czmq sets up a signal handler that flags the interrupt in a global variable
[19:03] pieterh	thus, when you get control back from zmq_poll you can test for this interrupt, and clean up properly if it hits
[19:04] pieterh	otherwise, ctrl-C will simply kill the process without a chance to clean up
[19:04] pieterh	now, for SIGTSTP, I'd guess this is rare enough no-one's hit the problem before
[19:05] pieterh	I'd say since you have to handle SIGINT anyhow, you can set-up a similar handler for SIGTSTP
[19:06] pieterh	I guess zmq_poll will still return with EINTR but at least you'll be able to handle the situation
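Note on handling EINTR: a minimal sketch of the pattern pieterh describes, written against the plain POSIX poll() call rather than zmq_poll so that it stands alone. It assumes that a ctrl-C should end the wait so the caller can clean up, and that any other interruption (for example a stop and continue) should simply poll again. The names install_handlers, interrupted and poll_retrying are hypothetical.

```cpp
#include <cerrno>
#include <poll.h>
#include <signal.h>

// Set by the SIGINT handler; checked after every interrupted poll().
static volatile sig_atomic_t g_interrupted = 0;

static void on_sigint(int) { g_interrupted = 1; }

// Installs the flag-setting handler for SIGINT. Returns true on success.
bool install_handlers() {
    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = on_sigint;
    return sigaction(SIGINT, &sa, 0) == 0;
}

// True once ctrl-C (SIGINT) has been seen.
bool interrupted() { return g_interrupted != 0; }

// Waits like poll(), but retries when the call is interrupted by a
// signal other than SIGINT. Returns the poll() result, or -1 on a real
// error or after SIGINT (errno is EINTR in the latter case).
// Simplification: each retry starts the full timeout again.
int poll_retrying(struct pollfd *fds, nfds_t nfds, int timeout_ms) {
    for (;;) {
        int rc = poll(fds, nfds, timeout_ms);
        if (rc >= 0) return rc;          // ready descriptors or timeout
        if (errno != EINTR) return -1;   // genuine error
        if (g_interrupted) return -1;    // ctrl-C: let the caller clean up
        // Some other signal interrupted the wait: poll again.
    }
}
```

With the C++ binding, the equivalent would presumably be to catch zmq::error_t around zmq::poll and retry when the error number is EINTR, instead of letting the exception terminate the program.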
[19:06] pieterh	this is the kind of stuff I'd expect a high-level C++ API to take care of
[19:06] private_meta	I can't imagine that SIGTSTP is rare
[19:09] pieterh	you may be able to just set up an ignore handler
[19:09] pieterh	but I suspect poll will still exit
[19:10] private_meta	I'm in contact with the #linux channel to check out what actually seems to transpire
[19:23] yukonbob	if I stress socket creation/deletion, I get Too many open files
[19:23] yukonbob	rc == 0 (mailbox.cpp:375)
[19:24] yukonbob	are these lazily collected, and I'm just pressing its limits?
[19:24] yukonbob	, or do I have a flaw?
[19:33] yukonbob	... no indication of leaked memory on my part; sockets created via zmq_socket(), destroyed via zmq_close();
[19:49] pieterh	yukonbob: IMO Linux keeps sockets around for a while after your process closes them
[19:51] yukonbob	so, doesn't necessarily indicate error w/ my use, nor zmq per se...
[19:51] pieterh	right
[19:52] pieterh	you can raise the number of files per process
[19:52] pieterh	but you can still hit that by opening/closing sockets rapidly
[19:52] pieterh	sure
[19:53] yukonbob	pieterh: cool. Thx for feedback.
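
Editor's note: raising the per-process file limit, as pieterh suggests, can be done from inside the program with the POSIX getrlimit/setrlimit calls. The sketch below is only an illustration; the helper name raise_fd_limit is hypothetical, and it can only raise the soft limit as far as the hard limit the system already allows.

```c
#include <sys/resource.h>

/* Hypothetical helper: make sure the soft limit on open file
   descriptors is at least `wanted`, capped at the hard limit.
   Returns 0 on success, -1 on failure (errno is set). */
int raise_fd_limit(rlim_t wanted)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;
    if (lim.rlim_cur >= wanted)
        return 0;           /* already high enough */
    /* An unprivileged process cannot go above the hard limit. */
    lim.rlim_cur = wanted < lim.rlim_max ? wanted : lim.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &lim);
}
```

Calling raise_fd_limit(4096) before creating the 0MQ context would be one way to use it. As noted in the chat, a higher limit only delays the error if sockets are opened and closed faster than their descriptors are released.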
[19:54] private_meta	pieterh: ok, I've pretty much assured myself that the signal sent is SIGTSTP, the same signal sent when debugging a program (and stepping) or pausing a program to move it to the background, and it definitely SHOULDN'T cause this program termination
[19:54] private_meta	pieterh: however, on RedHat it appears to do that
[19:54] pieterh	that sucks
[19:56] private_meta	And I also looked it up
[19:56] private_meta	it sends the same signal on Ubuntu as on RedHat
[19:56] private_meta	it's both SIGTSTP, it's both on ctrl+z or other program suspension functionality, but it just screws up rhel
[19:57] pieterh	I've not used RHAT since 1997 or so, and am quite happy to not have to use it
[19:58] private_meta	But imho that's still not a reason for it not to work on that operating system :/
[20:11] jond	private_meta: what does info signals show in gdb
[20:17] private_meta	jond: I can tell you tomorrow, after 13.5 hours today, I had to get home from work
[20:23] jond	private_meta: you might can use handle command to tell gdb to do different stuff, possibly handle SIGSTP nostop noprint pass would help I won't be around in the day (UK)
[20:23] jond	s/might//g
[20:24] private_meta	jond: I don't have access to the redhat server when I'm at home, at least not without LogMeIn, which I want to avoid
[20:25] jond	private_meta: no problems, worth a try though
[20:25] private_meta	I assume you mean SIGTSTP?
[20:25] jond	yep
[20:25] private_meta	what does "handle SIGTSTP nostop noprint pass" actually do?
[20:26] private_meta	i mean, it seems it tells gdb to handle "SIGTSTP" in the following three ways
[20:26] private_meta	but what do those count for?
[20:27] jond	it basically tells the debugger not to handle it, not to print it and to just pass it on.
[20:27] private_meta	so just to check if that makes the program continue as usual?
[20:28] jond	yes. I've had trouble with gdb debugging and signals before, but not SIGTSTP
[20:29] private_meta	Will you be online over the day?
[20:29] private_meta	(I know you're not here, just asking about being online)
[20:30] private_meta	just interested if a highlight with my results will be ok, or if i have to wait ;)
[20:30] jond	i don't know because I start a new position tomorrow and I am not clear what security is in place or if I have a desk even yet....
[20:30] private_meta	ah ok
[20:30] private_meta	well, my client is online somewhere else, always on, so I'm always unsure about others ;)
[20:31] jond	ok, we'll see tomorrow
[20:32] private_meta	indeed we will
[20:37] cwb	If I specify a pattern of message passing over 0MQ -- sending data of type A to pull socket 1 will result in data of type B = f(A) being pushed to socket 2 and so on -- would you call that an API specification or a protocol?
[20:38] cwb	Are (well specified) 0MQ patterns generally protocols?
[20:42] cwb	I guess what I'm asking is: when should we say that a program that uses 0MQ for messaging has an API and when should we say that it implements a protocol? What's the distinction?
[20:46] private_meta	hmm... I guess many of the people who can help you with that are online during daytime (gmt/utc)
[20:46] private_meta	@ cwb
[20:52] pieterh	cwb: hi
[20:53] cwb	pieterh: hi!
[20:54] pieterh	strictly speaking, the API tells you how to speak to objects like contexts and sockets
[20:54] pieterh	whereas the protocols define how these talk to each other across some transport
[20:54] pieterh	there is some overlap in semantics
[20:55] pieterh	e.g. 'send' in the API can map to 'send' at the transport level
[20:55] pieterh	does this help?
[20:56] cwb	But the protocol send might be implemented with something like "snd" or require a few other implementation specific things?
[20:56] pieterh	let's say, the socket semantics might be implemented with various protocols
[20:57] cwb	Yep
[20:57] pieterh	e.g. if you're working over inproc, tcp, or pgm, a publisher send to subscribers will use different protocols
[20:57] pieterh	0MQ kind of pretends it has a single TCP protocol (for example) but that's a little inaccurate
[20:57] cwb	Ok..
[20:58] pieterh	in fact the socket semantics create variations of the TCP protocol at some level
[20:58] pieterh	this is not really documented in the formal protocol spec
[20:58] cwb	Yes, that makes sense.
[20:58] cwb	Ok.
[20:59] cwb	Actually, my question was more about things that get implemented using 0MQ -- i.e. applications.
[20:59] pieterh	so there are two classes of application
[21:00] cwb	If my Python program uses 0MQ to talk to a C program -- would you say they are talking using APIs or a protocol?
[21:00] pieterh	those that use the API and those that use the protocol
[21:00] pieterh	your Python and C programs are using both the API and the protocol
[21:00] pieterh	there are some 0MQ stacks that don't use the same API at all
[21:01] cwb	Ok.
[21:01] pieterh	e.g. http://www.zeromq.org/bindings:bash
[21:01] cwb	There's a bash binding? That's cool.
[21:02] pieterh	Daniel Lundin made this, it just publishes a single message
[21:03] pieterh	its API is 'zmq_push'
[21:03] pieterh	and its protocol is http://rfc.zeromq.org/spec:13, same as libzmq
[21:04] cwb	Aha, that helps.
[21:04] cwb	Excellent, thanks!
[21:05] pieterh	app <==> API ---- protocol ---- API <==> app
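
Editor's note: to make the protocol half of that picture concrete, here is a rough sketch of the wire framing that an API call such as 'send' ends up producing. It assumes the short-frame layout of the spec:13 protocol mentioned above, as the editor understands it: one length octet that counts the flags octet plus the body, then the flags octet (bit 0 meaning "more frames follow"), then the body. The function name zmtp1_encode_short is hypothetical, and long frames and the connection greeting are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: encode one short frame into `out`.
   Layout assumed: [length = 1 + body_len][flags][body...].
   Returns the number of bytes written, or 0 if the body is too long
   for the one-octet length form or `out` is too small. */
size_t zmtp1_encode_short(unsigned char *out, size_t out_size,
                          const void *body, size_t body_len, int more)
{
    if (body_len > 253 || out_size < body_len + 2)
        return 0;
    out[0] = (unsigned char)(body_len + 1);   /* counts flags + body */
    out[1] = more ? 0x01 : 0x00;              /* bit 0: more frames follow */
    if (body_len > 0)
        memcpy(out + 2, body, body_len);
    return body_len + 2;
}
```

Under this assumption, an application on either side only ever sees the API; bytes like these are what travel between the two API layers in the diagram.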
[21:06] cwb	Perfect.
[21:19] whack	pieterh: haha, that 'bash' binding is quite funny
[21:20] pieterh	whack: yeah, it's pretty cool
[21:21] pieterh	It actually fits into twitter
[22:29] jsimmons	which of the github repos am I supposed to be using
[22:58] Seta00	jsimmons, zeromq2-1 or libzmq for 0MQ 3.0