Sunday, August 20, 2006

week of work

This last week was pretty uneventful. I worked. There was a DAS sprint last week and while I wasn't in California for it I did participate from here. It started in the evening with a 6pm phone conference call. For the first few days I went to a cyber cafe nearby. I figured I would talk and use their bandwidth instead of the DSL here. There's usually a 3GB/month cap and I have no idea how much of it I've used. I've been putting off downloading movies and large images, cvs updates, things like that. The noise was too loud though, being on Main street. For the last couple of days I did the call from here. Last January when I did a conference call I used a phone card but there's no land line in this apartment so VOIP for me.

For the sprint I updated the spec and implemented a reference server. I've used the latter as a practice piece for learning more about TurboGears, SQLObject, SQLAlchemy, and other bits of technology. I understand a bit better why people have raved about SQLAlchemy. There are a few bits I'm still shaky about.

Usually I would work on the sprint until about 3 or 4 am, with breaks. I'm usually a go-to-bed 2am person. I've found that without an alarm I sleep about 8 hours. Waking up at noon seems like a waste of the day. When I've pushed my schedule that late I end up being groggy for the first few hours. And then it's time for another phone call.

The last day's (evening's) call was the strangest. Everyone else (US sprint; the UK people didn't participate in this one) wants a change to the basic feature data structure. It's strange because it makes no sense to me while to everyone else it's obvious. I'm the spec author so I'm the one that needs to be convinced. I also need to resolve this. My approach Friday evening (after the call) was to figure out a difference between the two.

I'm a protein guy by training. The examples brought up in the conference call to justify the reasonableness of the change were all DNA oriented. (The short version is: if a child feature has a location then the parent feature must have a single location on that segment and it must cover all the locations in the children.) My counterexamples were all protein so I spent some time trying to find DNA-based counterexamples. I came up with a couple, but I know so little about DNA. I have to stretch back to '93 when I learned some of the basics of regulatory factors.

I think the reason for the difference in viewpoint is that DNA as a physical thing is very boring. And I say that with the highest respect; proteins get all the action while DNA mostly sits there. Only a small bit of the human genome even gets transcribed. Some of that goes into protein, hence the annotations are pretty indirect. Some proteins bind to DNA to promote or inhibit expression of certain genes so these are a bit exciting, but the binding sites are all small, contiguous regions. BLAST results have gaps but the region in the gaps is relevant so even there having a single covering location for the parent makes sense.

Compare that to a protein annotation like "catalytic triad" where a location for the parent element for the three features (assuming a feature per residue) makes no sense. Digging around I did find some annotation types where having a parent with a single location covering all of its children didn't make sense: D-loop in mitochondria, promoter groups (multiple promoter sites for a given gene), and RNA/ssDNA structure and catalytic function.
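To make the proposed rule concrete, here's a minimal Python sketch of it. Everything in it is an assumption for illustration: the `Feature` class, the `(segment, start, end)` location tuples, and the function names are made up and are not the DAS data model. The example residues are the chymotrypsin triad (His57, Asp102, Ser195), which I picked myself as a familiar case.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical model, not taken from the DAS spec:
# a location is (segment, start, end), 1-based and inclusive.
Location = Tuple[str, int, int]


@dataclass
class Feature:
    name: str
    locations: List[Location] = field(default_factory=list)
    children: List["Feature"] = field(default_factory=list)


def covering_location(feature: Feature, segment: str) -> Optional[Location]:
    """Smallest single span on `segment` that covers every direct child location there."""
    spans = [(start, end)
             for child in feature.children
             for (seg, start, end) in child.locations
             if seg == segment]
    if not spans:
        return None
    return (segment, min(s for s, _ in spans), max(e for _, e in spans))


def satisfies_proposed_rule(feature: Feature) -> bool:
    """Check the rule as paraphrased in the post: for each segment where a child
    has a location, the parent has exactly one location and it covers the children."""
    segments = {seg for child in feature.children for (seg, _, _) in child.locations}
    for segment in segments:
        own = [loc for loc in feature.locations if loc[0] == segment]
        if len(own) != 1:
            return False
        _, need_start, need_end = covering_location(feature, segment)
        _, start, end = own[0]
        if start > need_start or end < need_end:
            return False
    return True


# One feature per residue, with a parent that has no location of its own.
triad = Feature("catalytic triad", children=[
    Feature("His57", [("chymotrypsin", 57, 57)]),
    Feature("Asp102", [("chymotrypsin", 102, 102)]),
    Feature("Ser195", [("chymotrypsin", 195, 195)]),
])

print(satisfies_proposed_rule(triad))             # False: the parent has no location
print(covering_location(triad, "chymotrypsin"))   # ('chymotrypsin', 57, 195)
```

To pass the check the parent would need the single span 57 to 195, which is 139 residues standing in for three annotated ones. That's the sense in which a covering location says very little for this kind of annotation.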

Today I tried another approach. Assuming the data structure is changed, what are the consequences to the spec, and does the result make sense? My conjecture is it's needed to make the "inside" search work correctly, but I think the solution doesn't interact well with the other query types. I also think the "inside" search isn't needed and a better solution to the use-case is a "but_not_overlaps". We'll work this out over the next couple of weeks.
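Here's a rough sketch of why an "inside" search might be replaceable. The semantics are my assumptions, not the spec's: "overlaps" means sharing at least one position, "inside" means lying entirely within the query range, and "but_not_overlaps" is read as "overlaps the query but none of the excluded ranges". All function names are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (start, end), 1-based inclusive; an assumed convention


def overlaps(a: Range, b: Range) -> bool:
    """True if the two ranges share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def inside(feature: Range, query: Range) -> bool:
    """True if the feature lies entirely within the query range."""
    return query[0] <= feature[0] and feature[1] <= query[1]


def overlaps_but_not(feature: Range, query: Range, excluded: List[Range]) -> bool:
    """Assumed reading of 'but_not_overlaps': hits the query, misses every excluded range."""
    return overlaps(feature, query) and not any(overlaps(feature, x) for x in excluded)


def flanks(query: Range, segment_length: int) -> List[Range]:
    """The parts of the segment to the left and right of the query range."""
    out = []
    if query[0] > 1:
        out.append((1, query[0] - 1))
    if query[1] < segment_length:
        out.append((query[1] + 1, segment_length))
    return out


segment_length = 1000
query = (100, 200)
for feature in [(120, 180), (90, 150), (100, 200), (150, 250)]:
    same = inside(feature, query) == overlaps_but_not(
        feature, query, flanks(query, segment_length))
    print(feature, inside(feature, query), same)
```

Under these assumptions, "inside the range" gives the same answer as "overlaps the range but not its flanks" for a single contiguous location. Whether that carries over to features with several locations is exactly the part still to be worked out.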

Saturday afternoon I went to the salsa clinic. That's the monthly event at Que Pasa where John reviews the previous month's lessons over 90 minutes. I learned a few nice additions but if I don't practice them I'll forget them. Towards the end I danced some with .. I don't remember her name. She was quite good and fun to dance with. QP has more space than Buena Vista which means I can do things which take up space. At BV I often feel constrained in what I can do because it's so packed.

James and Amanda's house^H^H^H^H^Hflat warming party was last Friday evening. I snuck out for a few hours for that. I brought a couple of cans of Swedish cider as house warming gifts, which they enjoyed. Cider here is like British cider; dry and a bit. Bit. I don't know the right word. "flat", "muddy", "wooden" come to mind. So does "slightly sour." Swedish cider is sweet and carbonated. Jim (Cooper) called it R-rated Kool-Aid.

2 comments:

Anonymous said...

Hey my computer pulled up your blog!!!!! Yeah! Although I didn't REALLY understand most of what you said in the paragraphs before "salsa clinic" it sounds like you are trying to resolve some work-related issues. Good luck with that! ::cyber hug::

I am glad you are at least enjoying yourself, even if you have to "sneak out" to do so. I like Jim's description of Swedish cider. You need to send me some! Just kidding. Alcohol is illegal on deployment.

Talk to you later.

Anonymous said...

I'm down with Jim's description of Swedish cider.