Friday, December 18, 2015

[IDL 2 CPP] Let Bison/Flex Generate C++ Parser

Although C++ isn't a very modern language, C could make you upset even more. I'm developing a translator which could translate IDL (Interactive Data Language) to C++, and had enough of the ancient C because I need various complex data struct to hold all kind of information that needed when doing the translation. So it's absolutely a good idea that let Bison/Flex to generate a C++ parser. But the way of enabling C++ for Bison/Flex is hard indeed. Bison/Flex lack documents that tells you how to develop a complete C++ parser. When you turn on the C++ option for Bison/Flex, tremendous compiling errors are waiting for you. I read so many articles, and want to say thank you to all these authors. And I will share you the whole cook book of letting Bisong/Flex generating C++ Parser.

Download Win Flex Bison
I tried two windows versions. This is the good one to support C++ parser.
http://sourceforge.net/projects/winflexbison/

Option for Bison
These two options tell Bison to generate a C++ parser. The file lalr1.cc is in the win_flex_bison downloaded package.

%language "c++"
%skeleton "lalr1.cc"

Setup Parser's Class Name
The following option give a name to the parser class.

%define parser_class_name "idl_parser"

The parser class would look like as following

  /// A Bison parser.
  class idl_parser
  {

Pass Essential Arguments to the Parser
You have to pass scanner which would provide yylex, to the parser. The parser could call yylex and obtain a token each time. "translator" could hold some important data, but depends on your needs. "arg_yyin" is the input source code file.

%parse-param {yy::IdlTranslator& translator}
%parse-param {yy::IdlScanner& scanner}
%parse-param {std::ifstream* arg_yyin}
%parse-param {std::ofstream* outputStream}

The parser ctor looks like as following.

idl_parser (yy::IdlTranslator& translator_yyarg, yy::IdlScanner& scanner_yyarg, std::ifstream* arg_yyin_yyarg, std::ofstream* outputStream_yyarg);

All arguments passed in, the parser will store them in member variables.

    /* User arguments.  */
    yy::IdlTranslator& translator;
    yy::IdlScanner& scanner;
    std::ifstream* arg_yyin;
    std::ofstream* outputStream;

Tell the Parser the Prototype of the yylex
"yylval" is used to return token string from scanner to parser. "yylloc" is an argument which could tell the current location in the source file. This parameter is enabled by %locations . If you didn't use %locations, then there would be no yylloc.

The following macro definition usually be defined in scanner header file. And included by bison .y file and flex .l file.

#ifndef YY_DECL
# define YY_DECL                                        \
    yy::idl_parser::token_type                         \
    yy::IdlScanner::lex(yy::idl_parser::semantic_type* yylval, yy::idl_parser::location_type *yylloc)

And in your grammar file (.y) define yylex macro.

#define yylex scanner.lex

The parser will call lex as following

YYCDEBUG << "Reading a token: ";
yychar = yylex (&yylval, &yylloc);

By defining the yylex macro, parser actually calls scanner.lex().

Implementing the Scanner
Pay attention to the first part of the following code please. It redefines yyFlexLexer, so the class yyFlexLexer in file "FlexLexer.h" becomes IdlFlexLexer. It very important, it avoids many conflicts. But it's really a poor technique.

The following is part of the scanner's header file.

#ifndef __FLEX_LEXER_H
#define yyFlexLexer IdlFlexLexer
#include "FlexLexer.h"
#undef yyFlexLexer
#endif

#include "idl.tab.hh"


namespace yy
{
    class IdlScanner : public IdlFlexLexer
    {
    public:
        IdlScanner(std::ifstream* arg_yyin);

        virtual ~IdlScanner();

        virtual idl_parser::token_type lex(
            idl_parser::semantic_type* yylval,
            yy::idl_parser::location_type *yylloc
            );
    };
}

Have a Look at the Implementation of IdlScanner::lex()
The file lex.???.cc is the lexer implementation file. The file is generated by flex. In this file, the macro YY_DECL just defined before, is used here for the implementation code.

YY_DECL
{
register yy_state_type yy_current_state;
register char *yy_cp, *yy_bp;
register int yy_act;
...

How does the Input File Passed into Scanner
The scanner class is derived from class yyFlexLexer which is defined in FlexLexer.h. The input file "std::ifstream* in" is actually passed to the parent class via ctor. Remember yyFlexLexer was defined as IdlFlexLexer.

    IdlScanner::IdlScanner(std::ifstream* in)
        : IdlFlexLexer(in)
    {
    }

The implementation of yyFlexLexer ctor is in file lex.???.cc, which is generated by flex. The input file arg_yyin is stored in yyin, a member variable of yyFlexLexer. yyin is used to be a global variable in C lexer version. If yyin is NULL, then the lexer would use stdin as input.

yyFlexLexer::yyFlexLexer( std::istream* arg_yyin, std::ostream* arg_yyout )
{
yyin = arg_yyin;
yyout = arg_yyout;
yy_c_buf_p = 0;
yy_init = 0;
yy_start = 0;
yy_flex_debug = 0;
yylineno = 1; // this will only get updated if %option yylineno

Invoke the Parser
First create the scanner with the input file, and then pass the scanner to the parser. When you call parser.parse(), the parser will call scanner.lex() to get tokens one by one.

        std::ifstream *inputFile = new std::ifstream();
        inputFile->open("some source file here");

        std::ofstream *outputFile = new std::ofstream();
        outputFile->open("some output file here");

        IdlScanner scanner(inputFile);

        idl_parser parser(*this, scanner, inputFile, outputFile);

        int parse_ret = parser.parse();



Sunday, November 29, 2015

[IDL 2 CPP] Implement Variable Definitions

Since IDL (Interactive Data Language) is a kind of dynamic typed language (dynamic language for short), it doesn't have variable declarations. So we have to add declarations in translated C++ source code.

The basic method of adding declarations is by using mid-rules in Bison grammar file.

As following, initialize a data structure that could store variable names, and then each time the parse meets an assignment statement, store the variable name for future use. The following code show the way of recording variable names. But its not accurate, because unary_expression doesn't equal to variable name, sometimes it's an element of an array or some other ting. So here we need more complex code logic to accomplish it. But I won't put these code here because here is just a show of high level of structure of the method.

assignment_statement:
unary_expression '=' expression
{printf("bison got assign statement: %s = %s\n", $1, $3);
VariableNameSet_TryAdd($1);
$$ = AllocBuff();
sprintf_s($$, TEXT_BUFFER_LEN, "%s = %s;", $1, $3);
}
;

And at last, when composing the C++ function, put the variable declarations at the beginning of the function.

function_definition:
KEY_FUNCTION identifier parameter_list_line
{
VariableNameSet_Init();
}
statement_list KEY_END end_of_line
{
char *buf = AllocBuff();
IncreaseTab($5, buf);

$$ = AllocBuff();

char *var_decl_buf = AllocBuff();
VariableNameSet_DecleareVariables(var_decl_buf);

char *var_decl_buf_tabbed = AllocBuff();
IncreaseTab(var_decl_buf, var_decl_buf_tabbed);

sprintf_s($$, TEXT_BUFFER_LEN, "Variant %s%s{\n%s%s}%s\n", $2, $3, var_decl_buf_tabbed, buf, $7);
printf("IDL function: [%s]\n", $$);
}
;




Saturday, November 28, 2015

[IDL 2 CPP] Translate IDL Array Subscript Ranges to C++

I started a project that translates huge amount of IDL codes to C++ for sake of performance improvements. IDL is acronym for Interactive Data Language. This is IDL's homepage. And here is Wikipedia page.

Since the original data processing program which was written in IDL is so large, it's impossible to translate manually, I decide to develop a translator then translate it into C++ source code. I've worked on it for a couple of weeks. And today I finished the ranged array part. I think it's interesting so I'm going to share the development experience for you.

I am using flex/bison to generate a parser. The grammar rules for IDL's array and subscripts are here(ask me for sample code):

postfix_expression:
primary_expression
| postfix_expression '[' array_subscripts ']'

array_subscripts:
array_subscript
| array_subscripts ',' array_subscript
;

array_subscript:
expression
| range_or_whole_range ':' range_or_whole_range
;

range_or_whole_range:
expression
| '*'
;

No action codes here yet. These grammar rules cover simple array subscripts which will visit an element of an array, and also the subscript ranges case. For the simple case, the IDL code is easy to translate. For example, IDL code:

a = b[1]

which would be translated to

a = b[1];

Just exactly the same code in C++. But it would be more complex if the code uses ranged array, like this:

a = b[1:10]

Since there is no relevant grammar in C++, we have to do some fundamental support in C++. Firstly, there should be an object which could represent a range of an array. Say, we have this:

class ArrayDimDesc
{
int size;
int startIndex;
int endIndex;
};

class RangedArray
{
void* array;
int Dimensions;
ArrayDimDesc arrayDimDesc[8];
};

So IDL ranged array b[1:10] could be represented by class RangedArray object. This way made things easier. Replace b[1:10] by a function call to MakeArrayRange1D(), the whole statement can be translated just as simple array subscript case.

Let me show you a more complex IDL code:

coeffbk = where(sset.bkmask[nord:*, 1, 2:3] NE 0)

And the translated C++ code is:

coeffbk = where(MakeArrayRange3D(sset.bkmask, nord, -2, 1, -1, 2, 3) != 0);

I haven't finished all the work. But the main idea is just as above. There are lots of further work to do:
1, Memory management
2, Introduce smart pointer?
3, Is it necessary to move MakeArrayRange3D in front of the statement?



==== BISON grammar rules for IDL ranged array ======================
postfix_expression:
primary_expression
| postfix_expression '[' array_subscripts ']'
{
$$ = AllocBuff();

// check if array_subscripts is a range
if(IsRangedSubscripts($3))
{
FillMinusOneToEmptyRange($3);
// case 1: range, compose a ranged array; $3 is subscripts
// MakeArrayRange(array, startIndex, endIndex)
if($3->dimension == 1)
{
// 1 dim
sprintf_s($$, TEXT_BUFFER_LEN, "MakeArrayRange1D(%s, %s, %s)",
$1,
$3->subscriptsRange[0].rangeStart,
$3->subscriptsRange[0].rangeEnd);
}
else if($3->dimension == 2)
{
sprintf_s($$, TEXT_BUFFER_LEN, "MakeArrayRange2D(%s, %s, %s, %s, %s)",
$1,
$3->subscriptsRange[0].rangeStart,
$3->subscriptsRange[0].rangeEnd,
$3->subscriptsRange[1].rangeStart,
$3->subscriptsRange[1].rangeEnd);
}
else if($3->dimension == 3)
{
sprintf_s($$, TEXT_BUFFER_LEN, "MakeArrayRange3D(%s, %s, %s, %s, %s, %s, %s)",
$1,
$3->subscriptsRange[0].rangeStart,
$3->subscriptsRange[0].rangeEnd,
$3->subscriptsRange[1].rangeStart,
$3->subscriptsRange[1].rangeEnd,
$3->subscriptsRange[2].rangeStart,
$3->subscriptsRange[2].rangeEnd);
}
else
{
// unsupport dimension
yyerror("Unsupported array dimension in ranged array, dimension is %d.\n", $3->dimension);
YYABORT;
}
}
else
{
// case 2: scalar, simple and easy case
char *subscript_buf = AllocBuff();
subscript_buf[0] = 0;
int pos = 0;
for(int i=0; i<$3->dimension; i++)
{
sprintf_s((subscript_buf+pos), TEXT_BUFFER_LEN, "[%s]", $3->subscriptsRange[i].rangeStart);
pos = strlen(subscript_buf);
}

// ##ATTENTION :IDL array subscripts are different than those of C++'s##
sprintf_s($$, TEXT_BUFFER_LEN, "%s%s",
$1,
subscript_buf);

// release subscript_buf
}

printf("array final code: %s\n", $$);
}
| postfix_expression '.' IDENTIFIER
{
$$ = AllocBuff();
sprintf_s($$, TEXT_BUFFER_LEN, "%s.%s", $1, $3);
}
| function_call
;

array_subscripts:
array_subscript
{
$$ = Malloc(sizeof(struct ArraySubscripts));
$$->dimension = 1;
strcpy($$->subscriptsRange[0].rangeStart, $1->rangeStart);
strcpy($$->subscriptsRange[0].rangeEnd, $1->rangeEnd);

// release $1
}
| array_subscripts ',' array_subscript
{
strcpy($$->subscriptsRange[$$->dimension].rangeStart, $3->rangeStart);
strcpy($$->subscriptsRange[$$->dimension].rangeEnd, $3->rangeEnd);
$$->dimension++;
}
;

array_subscript:
expression
{// single subscript
struct ArraySubscript *pArraySubscript = Malloc(sizeof(struct ArraySubscript));
strcpy(pArraySubscript->rangeStart, $1);
pArraySubscript->rangeEnd[0] = NULL;
$$ = pArraySubscript;
}
| range_or_whole_range ':' range_or_whole_range
{// Ranged subscript
struct ArraySubscript *pArraySubscript = Malloc(sizeof(struct ArraySubscript));
strcpy(pArraySubscript->rangeStart, $1);
strcpy(pArraySubscript->rangeEnd, $3);
$$ = pArraySubscript;
}
;

range_or_whole_range:
expression
| '*' { $$ = "*"; }
;

Wednesday, March 11, 2015

Optimizing by ANSYS - Objective Function Defined by Other Software

Key Points:

  1. You have an objective function to be optimized (to find a minimum/maximum point)
  2. But the function is defined by a software other than ANSYS
  3. The function doesn't have a mathematical formula


Suppose you have an objective function which is defined by a complex software, say an FEM or CFD software. The function has one or several variables, and the function generates different values regarding different variable values input.

Basic idea of the solution is by using /sys command in ANSYS APDL code which let ANSYS to invoke your software, and evaluate the function. ANSYS will invoke your software repeatedly until it finds a minimum/maximum point.

Key points for data transfering from ANSYS to your software and vice versa:
1, Your software that defining the objective function should parameterizing the model (the function), so that it can accept one or more parameters and generates the function value.
2, Your software should have a way of passing value to parameter(s), such as command line or an input file.
3, Your software should have a way of passing back the function value, such as program exit code or an output file.

Here is an example of the solution with the way of passing value that passes data by input/output file.

TASCA is the software which defines the objective function. "tascain.txt" is the input file, and tascout.txt is the output file.

vwrite puts the value of X to the input file. Then the code invokes TASCA. TASCA will read the data from "tascain.txt", and evaluate the function, and then writes the function value to "tascaout.txt". The function value will be read from the file and stores in A1 and V.

FILE: volu.inp
*DIM, A1, array, 2,1

X=1.4

*cfopen, tascain,txt
*vwrite, X
(F10.2)
*cfclose

/sys, TASCA evaluate

*vread, A1(1,1), tascaout,txt, c:\working, JIK, 1,1
(F6.2)

V=A1(1,1)


The following shows the optimization control file. V defined as the objective function. The makes ANSYS to find the minimum value of the objective function. To start optimization, type these command in the command input box of the ANSYS: /input, optovlu,inp .

FILE: optvolu.inp
/clear,nostart
/input,volu,inp
/opt
opanl,volu,inp
opvar,X,dv,1.3,1.9,1e-2
opvar,V,obj,,,1e-2
opkeep,on
optype,subp
opsave,optvolu,opt0
opexec





Friday, March 6, 2015

Loading Resources From Assembly and Assign to An Element Programatically

Suppose that you have resource in your assembly:

 <Style x:Key="TextBlock_left" TargetType="{x:Type TextBlock}">  
   <Setter Property="HorizontalAlignment" Value="Left" />  
   <Setter Property="VerticalAlignment" Value="Center" />  
   <Setter Property="Margin" Value="5" />  
   <Setter Property="TextWrapping" Value="Wrap" />  
 </Style>  

And then you want to assign this style to an element:
DataGridTextColumn.ElementStyle = (Style)FindResource("TextBlock_right");

That's it!

Tuesday, January 6, 2015

Free Charting Tool For WPF C#

"Dynamic Data Display" is an extremely powerful, userful and free charting tool for WPF and C#. If you are looking for a charting tool, and upset with commercial charting tool and library all over the world, then Dynamic Data Display is the best for you.

http://www.mesta-automation.com/real-time-line-charts-with-wpf-and-dynamic-data-display/

Tuesday, December 3, 2013

Simplify Asynchronous Programming with C# "yield return"

A few words ahead, please see http://msdn.microsoft.com/en-us/magazine/cc163323.aspx, for the article by inventor of this technology. My post is just a reinvent of the Jeffrey's idea. But my implementation is easy to understand. I read the code by Jeffrey, it's large and hard to digest.

(Next article: Nested IEnumerable functions.)

OK, here we started with my understandable implementation.

Asynchronous programming is extremely important and useful when you are developing an internet server which is going to support large concurrent requests. When dealing with internet data transmission, blocking mode with one thread handling one request is the one of the most popular traditional technologies. Programmers like it because it's easy to implement. On par with traditional programming technologies, asynchronous programming has many advantages.

  1. Far less threads involved, reduce large amount of memory usage.
  2. Since you don't have to create so many threads, its really responsive to handle an incoming request.
Though it's hard to implement, of course, and it's really easy to introduce bugs. Asynchronous calls always appear in Begin/End functions pair, for example, HttpWebRequest.BeginGetResponse() and  HttpWebRequest.EndGetResponse(). You have to call the BeginGetResponse() first, passing in a call back function, so that when it could be called and you got notified when BeginGetResponse() succeeded with a real response object. If your code flow involves several asynchronous calls, then your code is extremely hard to understand and maintain, since your code logic would be teared into several functions.

So, how to make your code stay in beautiful, logical, human readable form of linear structure, avoiding of tearing it into pieces, but still you could enjoy the high performance asynchronous programming technology? The answer is "yield return".

The basic idea of yield return is to form a state machine by writing the state changing logic into a single function or nested function calls. The coding style looks like the code of traditional technology of blocking mode data transmission. Every yield return will return from the function with all state reserved, and when last time the function being called, the code will be executed from last yield return statement.

See this simple example:

        private IEnumerable<int> Foo()
        {
            // some code here
            yield return 1;

            // some code here
            yield return 2;

            // some code here
            yield return 0;

            // some code here
        }


Foo() is a function, also it's a state machine. Every time you call it the function will resume execution at the point that last time returned. For sure that every local variable remains in its state of last yield return. OK, let's see how to execute this function:

        private void Driver()
        {
            foreach (var ele in Foo())
            {
                // do something
            }
        }

It's time to show the core of the technology. Let's incorporate the state machine function with asynchronous function calls.

        private IEnumerable<int> Foo()
        {
            var ar = HttpWebRequest.BeginGetResponse(callbackHandler);
            yield return 1;

            var response = HttpWebRequest.EndGetResponse();
            response.DoSomthing();
        }

The code isn't real working code, just for theory illustration. The first statement calls BeginGetResponse() passing a callback handler in. The callback handler won't be called until BeginGetResponse() finishing the work. During this period of time, no thread of your app work for these stuffs. So you don't have to pay the price to wait for BeginGetResponse() to return, for example, a thread waiting by semaphore or event. Thus we have a chance to make Foo() to execute the rest of the code. But Driver() can't drive the execution for us. So we have to upgrade our Driver() to a real driver which can drive Foo() through all of its async steps.

        public class AsyncDriver
        {
            public IEnumerator<int> iterator;

            public void AsyncCallback(IAsyncResult ar)
            {
                DriveToNext();
            }

            private void DriveToNext()
            {
                iterator.MoveNext();
            }

            public void AsyncCallback(IAsyncResult ar)
            {
                DriveToNext();
            }
        }

        public void Main()
        {
            AsyncDriver asyncDriver = new AsyncDriver();
            asyncDriver.iterator = Foo().GetEnumerator();
        }

        private IEnumerable<int> Foo(AsyncDriver asyncDriver)
        {
            var ar = HttpWebRequest.BeginGetResponse(asyncDriver.AsyncCallback);
            yield return 1;

            var response = HttpWebRequest.EndGetResponse();
            response.DoSomthing();
        }

So we have AsyncDriver class. Pay attention to BeginGetResponse(), as you can see the call back function is offered by AsyncDriver. When BeginGetResponse() finished its work, AsyncDriver will take control, and it drive the async steps in Foo() by calling iterator.MoveNext(). The 'iterator' is from Foo().

That explains the theory of driving async steps of an IEnumerable function. Real code would be more complex.

Here is the complete code of AsyncDriver. And follow that is the usage of AsyncDriver.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;

namespace Common
{
    public class AsyncDriver
    {
        private class AsyncDriverResult : IAsyncResult
        {
            private object asyncState;

            public object AsyncState
            {
                get { return asyncState; }
                set { asyncState = value; }
            }

            public WaitHandle AsyncWaitHandle
            {
                get { throw new NotImplementedException(); }
            }

            public bool CompletedSynchronously
            {
                get { throw new NotImplementedException(); }
            }

            private bool isCompleted;

            public bool IsCompleted
            {
                get { return isCompleted; }
                set { isCompleted = value; }
            }
        }

        private AutoResetEvent executionFinishedEvent;
        private Mutex mutex;
        private AsyncDriverResult asyncDriverResult;

        private AsyncCallback AsyncDriverCallback;
        private Exception terminationException;

        private IEnumerable<int> enumerable;

        private IEnumerator<int> iterator;

        public delegate IEnumerable<int> AsyncEnumerator(AsyncDriver asyncDriver, object state);

        public AsyncDriver()
        {
            mutex = new Mutex();
        }

        public void AsyncCallback(IAsyncResult ar)
        {
            DriveToNext();
        }

        private void DriveToNext()
        {
            ThreadPool.QueueUserWorkItem((aeObject) =>
            {
                mutex.WaitOne();

                try
                {
                    bool tonext = iterator.MoveNext();
                    if (tonext)
                    {
                        if (iterator.Current == 0)
                        {
                            OnExecutionFinished();
                        }
                    }
                    else
                    {
                        OnExecutionFinished();
                    }
                }
                catch (Exception ex)
                {
                    terminationException = ex;
                    ex.Data.Add("PtolemaicCallStack", ex.StackTrace);
                    OnExecutionFinished();
                }
                finally
                {
                    mutex.ReleaseMutex();
                }
            },
            this);
        }

        private void OnExecutionFinished()
        {
            iterator.Dispose();
            executionFinishedEvent.Set();
            AsyncDriverCallback(asyncDriverResult);
        }

        public IAsyncResult BeginExecute(AsyncEnumerator asyncEnumerator, AsyncCallback callback, object state)
        {
            asyncDriverResult = new AsyncDriverResult();
            asyncDriverResult.AsyncState = state;

            enumerable = asyncEnumerator(this, state);
            AsyncDriverCallback = callback;
            Execute();
            return asyncDriverResult;
        }

        public void EndExecute(IAsyncResult ar)
        {
            executionFinishedEvent.WaitOne();
            if (terminationException != null)
            {
                throw terminationException;
            }
        }

        public void Execute()
        {
            executionFinishedEvent = new AutoResetEvent(false);
            iterator = enumerable.GetEnumerator();
            DriveToNext();
        }

        public static string DumpException(Exception ex)
        {
            string msg = "\n=================================================================\n"
                + ex.ToString() + "\n"
                + ex.Message + "\n\n" + ex.StackTrace + "\n";

            if (ex.Data.Contains("PtolemaicCallStack"))
            {
                msg += "\n---- PtolemaicCallStack ----\n" + ex.Data["PtolemaicCallStack"];
            }

            msg += "\n=================================================================\n";

            return msg;
        }
    }
}


The usage of AsyncDriver:

        public void Main()
        {
            Common.AsyncDriver asyncDriver = new Common.AsyncDriver();
            var serverContext = new ServerContext();
            serverContext.AsyncDriver = asyncDriver;
            asyncDriver.BeginExecute(Foo, FooCallback, asyncDriver);
        }

        private void FooCallback(IAsyncResult ar)
        {
            var asyncDriver = ar.AsyncState as Common.AsyncDriver;
            asyncDriver.EndExecute(ar);
        }

        private IEnumerable<int> Foo(Common.AsyncDriver asyncDriver, object state)
        {
            var ar = HttpWebRequest.BeginGetResponse(asyncDriver.AsyncCallback);
            yield return 1;

            var response = HttpWebRequest.EndGetResponse();
            response.DoSomthing();
        }