QlikChallenge #1: Getting the max value over multiple expressions

Some days ago I received a funny/tricky question:

Let’s assume you have the following data:

And you want to calculate the highest expression of Value_1, Value_2, etc. within your load script to get the following result:

How would you solve this?
Just use the comment section to post your ideas … I am looking forward to reviewing your solutions … :)

Regards
Stefan

P.S.: Ralph & Team: You are not allowed to participate in this competition because you have already received my solution for that question :)

Bookmark and Share

. Bookmark the permalink. Both comments and trackbacks are currently closed.

15 Comments

  1. Stefan Stoichev
    Posted April 15, 2011 at 06:57 | Permalink

    Hi Stefan,
    i had the same (but a little more complicated) requirement these days. My idea is in the script to get the values in separate table using CrossTable. Then calculate the max value per “case” (in another temp table), order this values and set RowNo() as sorting field. In final there is original table and sort table linked to original by “case” field.
    I’ve made a simple qvw file that works with data that you provide http://dl.dropbox.com/u/63800/sorting.qvw
    Hope that helps :-)

    Have a nice day!
    Stefan

    • Posted April 28, 2011 at 20:19 | Permalink

      Hi Stefan,

      nice approach, but regarding the customer’s requirements this does not exactly show the desired result.

      But thanks for your example and participation!
      Regards
      Stefan

  2. Nodier Torres
    Posted April 15, 2011 at 07:08 | Permalink

    Hi fun challenge, here is my solution:
    (The Mapping is a workaround in case the field names dont have numeric values, i couldnt use fieldnumber(Exp,’Data’) in the load statement)

    Data:
    LOAD * INLINE [
        Case, Value_1, Value_2, Value_3, Value_4
        aaa, 15, 35, 25, 25
        bbb, 50, 25, 10, 15
        ccc, 10, 5, 20, 65
    ];
    
    
    
    for i = 2 to NoOfFields('Data')
    temp:
    Load fieldname($(i),'Data') as fieldname,
    	$(i)-1 as fieldnumber
    autogenerate 1;
    next;
    
    fields:
    mapping load * resident temp;
    drop table temp;
    
    cross:
    crosstable(Exp,Value,1)
    LOAD  * RESIDENT Data;
    
    
    Inner join(cross) LOAD Case,  max(Value) as Value
    resident cross
    Group By Case; 
    
    
    join(Data) LOAD 
    Case,
     textbetween(Exp,'Value_',null()) as MG ,
    applymap('fields',Exp) as MaxGroup
    Resident cross;
    
    drop table cross;
    
  3. Posted April 15, 2011 at 16:02 | Permalink

    Good challenge, follow my solution. An simpler solution would be get the field name index (_1, _2, etc). But this cenarios would be limited.

    [Cases_Data]:
    LOAD * INLINE [
        Cases, Value_1, Value_2, Value_3, Value_4
        aaa, 15, 35, 25, 25
        bbb, 50, 25, 10, 15
        ccc, 10, 5, 20, 65
    ];
    
    FOR i = 2 TO NoOfFields('Cases_Data')
    
    	[FieldsIndex]:
    	LOAD 
    		FieldName($(i),'Cases_Data')	 as [AVal],
    		$(i)-1 as [FieldIndex]
    	AutoGenerate 1;
    
    NEXT;
    
    [Cases_Cross]:
    CrossTable(AVal, Data)
    LOAD Cases, 
         Value_1, 
         Value_2, 
         Value_3, 
         Value_4
    RESIDENT [Cases_Data];
    
    LEFT JOIN ([Cases_Cross])
    LOAD
    	[AVal],
    	[FieldIndex]
    RESIDENT [FieldsIndex];
    
    [Cases_Max]:
    LOAD
    	Cases,
    	Hash128(Cases,Max(Data)) as [%KeyIndex],
    	Max(Data) as [MaxValue]
    RESIDENT [Cases_Cross]
    GROUP BY Cases;
    
    LEFT JOIN ([Cases_Max])
    LOAD
    	Hash128(Cases,Data) as [%KeyIndex],
    	[FieldIndex] as [MaxGroup]
    RESIDENT [Cases_Cross]
    WHERE Exists([%KeyIndex],Hash128(Cases,Data));
    
    DROP TABLE FieldsIndex;
    
  4. Posted April 16, 2011 at 17:27 | Permalink

    … nice answers so far … I’ll post a comparison of all solutions (also mine) next week; in the meantime I’ll be waiting for further solutions … :)

    Regards
    Stefan

  5. Michael Laenen
    Posted April 17, 2011 at 22:14 | Permalink

    Hi Stefan,
    Here is my solution.
    Regards,
    Michael

    Data:
    Load * InLine [
    case,Value_1,Value_2,Value_3,Value_4
    aaa,15,35,25,25
    bbb,50,25,10,15
    ccc,10,5,20,65
    ];
    
    Tmp_cross:
    CrossTable (Value_Name, value, 1)
    load 
    case,Value_1,Value_2,Value_3,Value_4
    resident Data;
    
    left join (Data)
    load
    	case,
    	firstsortedvalue(Value_Name, -value) as Max_Value_Name
    resident Tmp_cross
    group by case;
    drop table Tmp_cross;
    
  6. Posted April 18, 2011 at 22:57 | Permalink

    Thanks for posting a fun challenge. Here’s my solution.

    Data:
    Load * InLine [
    Case,Value_1,Value_2,Value_3,Value_4
    aaa,15,35,25,25
    bbb,50,25,10,15
    ccc,10,5,20,65
    ];
    
    LEFT JOIN (Data)
    LOAD 
    	Case,
    	match(RangeMax(Value_1, Value_2, Value_3, Value_4),
    		Value_1, Value_2, Value_3, Value_4
    	) as MaxGroup
    RESIDENT Data;
    
  7. John Witherspoon
    Posted April 20, 2011 at 06:25 | Permalink

    I like Rob’s solution for this SPECIFIC problem, though I’d reduce it to this:

    LOAD *
    ,match(rangemax(Value_1,Value_2,Value_3,Value_4)
                   ,Value_1,Value_2,Value_3,Value_4) as MaxGroup
    INLINE [
    Case,Value_1,Value_2,Value_3,Value_4
    aaa,15,35,25,25
    bbb,50,25,10,15
    ccc,10,5,20,65
    ];
    

    I think the preceeding load would normally be faster to execute, and it’s certainly less script. However, I’d like to see a more general solution. Things to consider:

    – We shouldn’t rely on the pattern in the field names. I don’t think anyone has, and a couple people have basically stated this explicitly. I agree with them.

    – In some cases, we might want a field name instead of a number.

    – What should we do for cases where more than one field has the maximum value? Just take the first? If so, first in what sequence? Return all of them in a concatenated value?

    – What if there are other fields in the table besides the key and the values? Text fields, for instance, or a number that shouldn’t participate in the maximum? We would want to ignore those extra fields. I think this should be handled by specifying the list of fields we care about. And we should only have to give that list ONCE to avoid future maintenance problems as we add and remove fields from the list.

    So here are a couple of solutions. Both specify the fields to include in an inline load, with the field number assigned by the position in the list. The first solution uses this field number to return only the FIRST maximum field and field number. It’s pretty much Rob’s solution with the mentioned additions:

    Fields: 
    LOAD 
     concat(Field,',',Seq) as Fields
    ,concat(chr(39) & Field & chr(39),',',Seq) as FieldsQuoted
    ;
    LOAD *
    ,recno() as Seq 
    // Add all fields to be included to this list
    // List in the order you want selected if more than one = maximum
    INLINE [
    Field
    Something
    SomethingElse
    AField
    AnotherField
    ];
    LET vFields = peek('Fields');
    LET vFieldsQuoted = peek('FieldsQuoted');
    DROP TABLE Fields;
    
    Data:
    LOAD *
    ,pick(match(rangemax($(vFields)),$(vFields)),$(vFieldsQuoted)) as MaxName
    ,     match(rangemax($(vFields)),$(vFields)) as MaxGroup
    INLINE [
    Case,Something,SomethingElse,AField,AnotherField,DoNotUseThisField,OrThisOne
    aaa,15,35,25,35,45,A
    bbb,50,25,10,15,10,B
    ccc,10,5,20,65,5000,C
    ];
    

    The second solution returns a list of maximum fields and field numbers. It uses the field number to put the list of maximum field names in the desired order. I’m… not really satisfied with this one. Way too many joins for my taste. I’m betting it could be significantly simplified, but I really should do something else with my evening. :)

    Data:
    LOAD * INLINE [
    Case,Something,SomethingElse,AField,AnotherField,DoNotUseThisField,OrThisOne
    aaa,15,35,25,35,45,A
    bbb,50,25,10,15,10,B
    ccc,10,5,20,65,5000,C
    ];
    
    Fields:
    LOAD *, recno() as Seq 
    // Add all fields to be included to this list
    // List in the order you want shown if more than one = maximum
    INLINE [
    Field
    Something
    SomethingElse
    AField
    AnotherField
    ];
    FieldList: 
    LOAD concat(Field,',',Seq) as Fields
    RESIDENT Fields
    ;
    LET vFields = peek('Fields');
    DROP TABLE FieldList;
    
    Temp:
    CROSSTABLE (Field,Value)
    LOAD Case, $(vFields)
    RESIDENT Data
    ;
    LEFT JOIN (Temp)
    LOAD Field, Seq
    RESIDENT Fields
    ;
    LEFT JOIN (Temp)
    LOAD Case, $(vFields)
    RESIDENT Data
    ;
    LEFT JOIN (Data)
    LOAD Case
    ,concat(if(Value=rangemax($(vFields)),Field),',',Seq) as MaxFields
    ,concat(if(Value=rangemax($(vFields)),Seq  ),',',Seq) as MaxNumber
    RESIDENT Temp
    GROUP BY Case
    ;
    DROP TABLES Temp, Fields;
    
  8. John Witherspoon
    Posted April 21, 2011 at 02:50 | Permalink

    I can’t believe I didn’t see this before. All I needed to do was use a comma-delimited list of fields instead of a vertical list.

    LET vFields = 'Something,SomethingElse,AField,AnotherField';
    LET vQuoted = chr(39) & replace('$(vFields)',',',chr(39) & ',' & chr(39)) & chr(39);
    
    LOAD *
    ,pick(match(rangemax($(vFields)),$(vFields)),$(vFieldsQuoted)) as MaxName
    ,     match(rangemax($(vFields)),$(vFields)) as MaxGroup
    INLINE [
    Case,Something,SomethingElse,AField,AnotherField,DoNotUseThisField,OrThisOne
    aaa,15,35,25,35,45,A
    bbb,50,25,10,15,10,B
    ccc,10,5,20,65,5000,C
    ];
    
  9. Alessandro Furtado
    Posted May 5, 2011 at 00:10 | Permalink
    Let Min = 'rangemin(Value_1,Value_2,Value_3,Value_4)';
    Let Max = 'rangemax(Value_1,Value_2,Value_3,Value_4)';
    //
    LOAD
         Case, 
         Value_1, 
         Value_2, 
         Value_3, 
         Value_4,     
         if(Value_1=$(Min),1,if(Value_2=$(Min),2,if(Value_3=$(Min),3,4))) as Min,
         if(Value_1=$(Max),1,if(Value_2=$(Max),2,if(Value_3=$(Max),3,4))) as Max
    FROM Chalenger.xlsx (ooxml, embedded labels);
    //
    
    • Posted May 9, 2011 at 21:18 | Permalink

      Hi Allessandro,

      thank you for your solution … I hope I will have enough time this week to compare all these solutions :)

      Regards
      Stefan

  10. Posted April 12, 2012 at 17:37 | Permalink

    B:

    Crosstable(Column,Values,1)
    LOAD * INLINE [
    Data, Value1, Value2, Value3
    a, 25, 35, 56
    b, 12, 23, 54
    c, 11, 53, 12
    ];

    C:

    load Max(Values) as Maximum,Data Resident B group by Data;

    join

    LOAD * INLINE [
    Data, Value1, Value2, Value3
    a, 25, 35, 56
    b, 12, 23, 54
    c, 11, 53, 12
    ];

    drop table B;

  11. Posted April 16, 2012 at 08:39 | Permalink

    Last post wasn’t complete answer.

    B:

    Crosstable(Column,Values,1)
    LOAD * INLINE [
    Data, Value1, Value2, Value3
    a, 25, 35, 56
    b, 12, 23, 54
    c, 11, 53, 12
    ];

    C:

    load Max(Values) as Maximum,Data Resident B group by Data;

    join

    LOAD * INLINE [
    Data, Value1, Value2, Value3
    a, 25, 35, 56
    b, 12, 23, 54
    c, 11, 53, 12
    ];

    drop table B;
    FinalTable:
    Load pick(Maximum,Value1,Value2,Value3) as MaxGroup, * Resident C;

    Drop table C;

  12. TheWorldIsMine
    Posted November 14, 2012 at 23:59 | Permalink

    Look mine Solution:
    Dude!!

    http://dl.dropbox.com/u/24456540/Max_Expressions.QVW

    Tab_Header_XLS:
    NoConcatenate
    Load
    *
    From
    D:\Test\Expressions.TXT
    (txt, codepage is 1252, no labels, delimiter is ‘|’, msq)
    Where
    RecNo()<=1;

    Let vs_Fields_1 = NoOfFields('Tab_Header_XLS');

    Tab_Header:
    NoConcatenate
    Load * Inline
    [Field_Index,Field_Name];

    For vs_Index_1 = 1 to $(vs_Fields_1) Step 1

    Concatenate(Tab_Header)
    Load
    '@'&RowNo() as Field_Index
    ,Peek(FieldName($(vs_Index_1),'Tab_Header_XLS'),0,'Tab_Header_XLS') as Field_Name
    AutoGenerate(1);

    Next vs_Index_1

    Map_Name_Field:
    Mapping
    Load
    Distinct
    Field_Index
    ,Field_Name
    Resident
    Tab_Header;

    Tab_XLS:
    NoConcatenate
    Load
    *
    From
    D:\Test\Expressions.TXT
    (txt, codepage is 1252, no labels, delimiter is '|', msq);

    Tab_Espressioni:
    NoConcatenate
    Load * Inline
    [Index];

    Let vs_Records_1 = NoOfRows('Tab_XLS');

    For vs_Index_1 = 2 to $(vs_Fields_1) Step 1

    Concatenate(Tab_Espressioni)
    Load
    $(vs_Index_1) as Index
    ,Peek('@1',IterNo()-1,'Tab_XLS') as @1
    ,Peek('@$(vs_Index_1)',IterNo()-1,'Tab_XLS') as Value
    AutoGenerate(1)
    While
    IterNo()<=$(vs_Records_1);

    Next vs_Index_1

    Tab_Espressioni_Max:
    NoConcatenate
    Load
    @1
    ,Max(Value) as Max_Value
    Resident
    Tab_Espressioni
    Group By @1;

    Inner Join(Tab_Espressioni_Max)
    Load
    @1
    ,Value as Max_Value
    ,Index
    Resident
    Tab_Espressioni;

    Inner Join(Tab_XLS)
    Load
    @1
    ,ApplyMap('Map_Name_Field','@' & Index,'#') as Field_Name
    ,Index as Field_Position
    ,Max_Value
    Resident
    Tab_Espressioni_Max;

    Drop Tables Tab_Header_XLS
    ,Tab_Header
    ,Tab_Espressioni
    ,Tab_Espressioni_Max;

    Rename Fields using Map_Name_Field;

  13. johan
    Posted February 15, 2013 at 13:49 | Permalink

    A bit late for the competition. Anyway, I like to keep things simple:

    Data:
    load *, rangemax(Value_1,Value_2,Value_3,Value_4) as MaxVal;
    LOAD * INLINE [
    Case, Value_1, Value_2, Value_3, Value_4
    aaa, 15, 35, 25, 25
    bbb, 50, 25, 10, 15
    ccc, 10, 5, 20, 65
    ];

    DataSorted:
    load Case, RowNo() as MaxGroup
    Resident Data
    order by MaxVal desc;

One Trackback

  1. […] Getting the Max Value over Multiple Expressions via Qlikblog.at […]