Thursday, December 22, 2016

Sublime copying all text that match a pattern

I had a text file containing numbers in between text that I wanted to copy, move to a new file and format them.

Since the numbers were of of fixed length and at the same index I could have copied them using vertical select. The vertical select is tricky if the file is big and it wouldn't help if the pattern to looks for is different and can occur anywhere in the text.

I knew the first way was to write a regular expression for the pattern that I was looking for. I could highlight all the matches with my pattern. But now how do I copy it?

https://forum.sublimetext.com/t/copy-matching-lines/5877/2

I couldn't beleive it could be as simple as clicking Find All. The Find All gives a cursor to all the match with text selected. Amazing!!!

So Ctrl+c will copy all of them. Bravo!

I put the data on another file where I have to format them. This was easy. I used replace and group capture.

Find :      (455\d\d\d\d\d\d)
Replace : ($1),

The above will put all the matches within ( ... ),

Again I would have used Find All and used the cursor to modify all the words and put the brackets around them.

Thanks to sublime!!

Wednesday, December 21, 2016

Creating a dynamic table out of list of numbers to join


I had a list of ids and I wanted to check which rows are not present in the DB.

We can use 'in (1, 2, 3)' and that would say which ones are in, and 'not in' would return rows outside the ids I had.

So I wanted to make a table out of the ids and use left or right join.

I could not find any simple way of doing this and then the approach mentioned here http://stackoverflow.com/questions/8002178/how-to-select-ids-in-an-array-of-ids

select * from (
select TO_NUMBER(xt.column_value) id1 from
xmltable(
'123456,
123457,
123458,
123459') xt) a
left join my_table mt on mt.id = a.id1
where mt.id is null

Indeed I had 1000 of Ids in my list and not the above 4 :)

I choose left join as my_table had thousands of rows.

I really appreciate effort of anyone to answer on SO.

Tuesday, December 20, 2016

Query join with View - Oracle Performance

I had a slow running query that joined on two views, A and B. I had this wrong understanding that the query is slow as it prepares the whole view and then runs the query on the view. This was totally incorrect.

Consider view in a join as a sub-query. The optimizer would try to minimize the cost.In my case the join with view A performed better than the join with view B. Both had approximately same amount of data.

I went through the following articleshttps://blogs.oracle.com/optimizer/entry/optimizer_transformations_view_merging_part_1https://blogs.oracle.com/optimizer/entry/optimizer_transformations_view_merging_part_2https://blogs.oracle.com/optimizer/entry/basics_of_join_predicate_pushdown_in_oracle
And the explain plan proved it all.The join with view A was merged as index joins in the query with base table. The reason being view A had none of this (was a simple select with inner join)
  • UNION ALL/UNION view
  • Outer-joined view
  • Anti-joined view
  • Semi-joined view
  • DISTINCT view
  • GROUP-BY view

While View B had many outer joins and an outer join with another view, C. The outer joins on tables were moved to View pushed predicate.


The optimizer does a view merge for a view with outer join when it is possible to do so.

The join on view C was expensive as it was a HASH join.

Thursday, August 18, 2016

Avro - schema change compatibility

I was under the impression that I can add an optional field to Avro schema and consumers with old schema would still be able to read the message whether or not the newly added optional filed is set or not in the record.

So I added an optional field to an Avro schema, generated an Avro from it and I was able to read it from consumers using the old and new schema (this might not work if the schema changes are not at the end).

But this is not how it should work as the Avro schema is not forward compatible. And that is what happened and one of the consumers started to fail.

So I was surprised why my code was working and the other code (using older schema) was throwing errors when reading the avro created out of new schema. This is where it was failing


    BinaryDecoder binaryDecoder = DecoderFactory.get().binaryDecoder(is, null);
    while (!binaryDecoder.isEnd()) {
      DatumReader<MyClass> datumReader = new SpecificDatumReader< MyClass >(MyClass.class);
      datumReader.read(null, binaryDecoder); // failure here
    }

The difference was, I didn't had the while loop in my code, assuming the input stream represented a single record, while the other consumer was handling a batch of requests.

So the above code would read part of the avro byte array as per the old schema in the first iteration and the remaining bytes for the new field (even though it is not set) in the second iteration. Since the left bytes was not a complete record it was reporting errors.

Tuesday, March 29, 2016

Map.get - KeyNotFound error

It is interesting how different languages approach towards a given problem.

Let's have a look at how different languages react while trying to read a key from dictionary

Java

map.get('key_that_does_not_exist')

=> null

Ruby

map['key_that_does_not_exist']

=> nil 

Python

>> x['key_that_does_not_exist']
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
KeyError: 'key_that_does_not_exist'

>> x.get('key_that_does_not_exist)
=> None

Scala

val x = Map("a" -> "1”)
x("key_that_does_not_exist”) // Throws exception
x.get("key_that_does_not_exist”) // None