Topic: Generalizing programming language patterns in CodeBlocks

Offline stahta01

Re: Generalizing programming language patterns in CodeBlocks
« Reply #30 on: November 10, 2013, 04:35:12 pm »
Quote from: beqroson
> Nobody seems to be backing me up on this idea yet. Is there no good reason to be able to program in your own language? There is a reason that we have many languages on our globe, right? It creates boundaries between different cultures. Should we not reflect that in the business of C++ as well, or am I the only one thinking like that? Maybe, just maybe... just maybe, it would be better if I just go and do harakiri on myself, then I don't need to bother.  :P

The main problem is you NEVER really defined what you meant by language!

I can read the word "language" as either "programming language" (C, C++, D, Pascal) or "native language" (English, French, Spanish).

Tim S.
« Last Edit: November 10, 2013, 04:38:54 pm by stahta01 »
C Programmer working to learn more about C++ and Git.
On Windows 7 64 bit and Windows 10 64 bit.
--
When in doubt, read the CB WiKi FAQ. http://wiki.codeblocks.org

Offline Alpha

« Reply #31 on: November 10, 2013, 04:48:26 pm »
Quote from: beqroson
> Nobody seems to be backing me up on this idea yet. Is there no good reason to be able to program in your own language?

I think the issue is that most of us reading this thread (or at least I) do not understand exactly what your idea is trying to achieve.  (Also, posting binary-only files as an idea for an open source project is... unusual.)

If by language you mean "native language", the general consensus here is that if one localizes one's source/build process to anything other than English, one will achieve nothing but a headache when trying to search documentation/error messages.

Offline beqroson

« Reply #32 on: November 10, 2013, 04:50:19 pm »
Quote from: stahta01
>> Instead of pseudo code, perhaps I can describe usage patterns that I personally would like to see.
>>
>> Open a source code project downloaded from the net:
>> 1. Open the project.
>> 2. Select menu->languages->Italian, which immediately translates the source into the Italian language.
>
> Sounds like a mixed C Processor and Poedit combined with Comment Translator.
>
>> Compile the source project:
>> 1. Press menu->Build, the Italian code is "JIT"-translated into code known by the current compiler, such as GCC.
>
> Sounds like a C Processor.
>
>> Release a source header set:
>> 1. Press menu->Export, the code is stripped of the Italian language and is left with only the compiler known language, such as C++.
>
> Sounds like a Text Processor tool.
>
> So far, I see no need for a CB Plugin; I suggest trying out the CB Contrib ToolPlus Plugin and see if that will do all that you need.
> If nothing else it might be quicker during development.
>
> Tim S.



Thanks a lot, stahta01. I may have prematurely dismissed the idea of using other tools for the task, thinking it would be better if it were integrated. Then again, maybe not. Perhaps several tools are better, since there are several tasks involved.

Offline beqroson

« Reply #33 on: November 10, 2013, 04:55:51 pm »
Quote from: stahta01
>> Nobody seems to be backing me up on this idea yet. Is there no good reason to be able to program in your own language? There is a reason that we have many languages on our globe, right? It creates boundaries between different cultures. Should we not reflect that in the business of C++ as well, or am I the only one thinking like that? Maybe, just maybe... just maybe, it would be better if I just go and do harakiri on myself, then I don't need to bother.  :P
>
> The main problem is you NEVER really defined what you meant by language!
>
> I can read the word "language" as either "programming language" (C, C++, D, Pascal) or "native language" (English, French, Spanish).
>
> Tim S.


Yes, my definition was that the programming language and the native language can be one and the same. For example, I could write wholly in English, e.g. "function DoSomething()", or wholly in Italian, e.g. "funzione FareQualcosa()"; in the second case both the native words and the programming constructs would be categorized as "Italian".

Now, in the world of programming, I was thinking that the translation could simply exchange words one for one.
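
For illustration, the one-for-one exchange described here could be sketched as follows, in Python for brevity. This is only a sketch under assumptions: the table entries and the helper name `swap_words` are invented for the example and do not come from any existing tool.

```python
import re

# Hypothetical Italian -> English table; the entries are illustrative only.
IT_TO_EN = {
    "funzione": "function",
    "se": "if",
    "altrimenti": "else",
    "FareQualcosa": "DoSomething",
}

def swap_words(source, table):
    """Replace every whole word found in `table`; leave everything else as-is."""
    def replace(match):
        word = match.group(0)
        return table.get(word, word)
    # \w+ matches whole identifier-like runs, so "se" inside "sezione" is untouched.
    return re.sub(r"\w+", replace, source)

print(swap_words("funzione FareQualcosa()", IT_TO_EN))  # function DoSomething()
```

A swap this naive also rewrites words inside string literals and comments, so a real tool would probably need to tokenize the source first.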
« Last Edit: November 10, 2013, 05:14:32 pm by beqroson »

Offline beqroson

« Reply #34 on: November 10, 2013, 05:03:58 pm »
Quote from: Alpha
>> Nobody seems to be backing me up on this idea yet. Is there no good reason to be able to program in your own language?
>
> I think the issue is that most of us reading this thread (or at least I) do not understand exactly what your idea is trying to achieve.  (Also, posting binary-only files as an idea for an open source project is... unusual.)
>
> If by language you mean "native language", the general consensus here is that if one localizes one's source/build process to anything other than English, one will achieve nothing but a headache when trying to search documentation/error messages.

Yes, I agree with that, unless all the developers understand the same other language, which today may be more and more uncommon. What I was trying to achieve was to connect the knowledge of programming with the bulk of knowledge that you possess in your native language. It would then be more comfortable for the mind to program in your native language, especially when you strain the mind with more advanced thinking about algorithms and so on. Using two languages for the process requires the mind to engage two different language centers, which hampers the thinking process.

The headache that you are talking about is apparently both well known and real. I was figuring that if the source could be distributed in a common language, then that would be no problem.

I have not dealt with the language used in comments, though.

However, a scenario where we would have only one global language, such as English? I am very pessimistic about that. What I see is that a language tends to accumulate a certain set of traits. With only one language we would, by that theory, have only one set of traits at a global level. Such a reduction of the wealth of human culture would be... simply unacceptable.

Offline beqroson

« Reply #35 on: November 10, 2013, 07:43:33 pm »
Suppose I create a tool based on a variant of UTF-8 that I call UTF-82. The only difference between UTF-8 and UTF-82 is that the forbidden bytes C0-C1 and F9-FF are accepted in the string. To scan or display a UTF-82 string, one simply ignores those bytes. Would you include UTF-82 as a file format in CodeBlocks? That would be very helpful.
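
As a rough sketch of what "just ignore those bytes" could mean, here is a decoder in Python. UTF-82 is only the proposal in this post, not an existing encoding, and the name `decode_utf82` is made up for the example. The byte ranges follow the post as written; note that standard UTF-8 (RFC 3629) also never uses F5-F8, which this sketch leaves to the normal UTF-8 decoder to reject.

```python
# Bytes the proposal wants tolerated and skipped: C0-C1 and F9-FF.
IGNORED = frozenset([0xC0, 0xC1]) | frozenset(range(0xF9, 0x100))

def decode_utf82(data: bytes) -> str:
    """Drop the ignored bytes, then decode what is left as ordinary UTF-8."""
    cleaned = bytes(b for b in data if b not in IGNORED)
    return cleaned.decode("utf-8")

print(decode_utf82(b"h\xc0i\xff"))  # hi
```

Valid multi-byte characters are unaffected, because none of their bytes fall in the ignored ranges.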

Update: Or am I supposed to add it myself? If I post patches, would you accept them and have the format added? Or is this someone else's task? I do not know what rules apply here.
« Last Edit: November 10, 2013, 08:25:59 pm by beqroson »

Offline dmoore

« Reply #36 on: November 10, 2013, 08:33:13 pm »
As a native English speaker, it's hard for me to judge how much of an impediment not being able to work in one's native language is. I guess it must be some, but it's not like English == C++, and sometimes the English names for programming constructs are actually misleading (e.g. "Cookie").

It's not clear that modifying the IDE itself is the best way to achieve what you want. If I am reading correctly, I think what you have in mind is sort of a pre-processor that converts the user's native language source to language that the compiler understands ("program language") using a dictionary that maps native "words" to their compile-able equivalent. You also need to be able to convert the compiler messages back to their native language equivalents. At least from the compiler perspective, the best way to do this might be a command line tool that wraps around the GCC toolchain and handles the necessary conversions. You could then add support for this in Code::Blocks (or any other IDE that has a flexible build system) by calling your command line tool instead of the regular GCC toolchain.
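
A very rough sketch of such a wrapper, in Python, might look like the following. Everything specific here is an assumption for illustration: the Italian table, the function names, and the decision to translate only `.cpp` arguments. Only `g++` itself and the standard library calls are real.

```python
import re
import subprocess
import tempfile
from pathlib import Path

# Hypothetical Italian -> C++ keyword table; entries are illustrative only.
NATIVE_TO_CXX = {"se": "if", "altrimenti": "else", "mentre": "while"}
CXX_TO_NATIVE = {cxx: native for native, cxx in NATIVE_TO_CXX.items()}

def swap_words(text, table):
    """Whole-word replacement using `table`; unknown words pass through."""
    return re.sub(r"\w+", lambda m: table.get(m.group(0), m.group(0)), text)

def wrapped_command(argv, workdir):
    """Translate each .cpp argument into `workdir` and return the real command."""
    real = ["g++"]
    for arg in argv:
        if arg.endswith(".cpp"):
            translated = swap_words(Path(arg).read_text(encoding="utf-8"),
                                    NATIVE_TO_CXX)
            out = Path(workdir) / Path(arg).name
            out.write_text(translated, encoding="utf-8")
            real.append(str(out))
        else:
            real.append(arg)
    return real

def run(argv):
    """Run the real compiler, then map words in its diagnostics back."""
    with tempfile.TemporaryDirectory() as workdir:
        proc = subprocess.run(wrapped_command(argv, workdir),
                              capture_output=True, text=True)
    return proc.returncode, swap_words(proc.stderr, CXX_TO_NATIVE)
```

An IDE would then call this script in place of the compiler. The sketch glosses over real problems: diagnostics will point at the temporary files, two sources with the same file name collide, and the reverse swap also rewrites ordinary English words such as "if" in the message text.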

On the other hand, allowing the user to use their native language for code completion (and other things like class and project wizards) would require very significant changes to the IDE. The CC parser would need to convert the native language to program language when parsing, and tokens would need to be converted from program language to native language for display in the UI.

Offline beqroson

« Reply #37 on: November 10, 2013, 08:55:48 pm »
Quote from: dmoore
> As a native English speaker, it's hard for me to judge how much of an impediment not being able to work in one's native language is. I guess it must be some, but it's not like English == C++, and sometimes the English names for programming constructs are actually misleading (e.g. "Cookie").

The effect is not noticeable at first. After years of programming, at least for me, it was bugging me increasingly. Even though I am quite fluent in English, it still bugged the crap out of me. Maybe I am sensitive. But there is more to it than just comfort. Think about how many times you use each syntactic term in C++ when programming. Maybe you type the term "if" thousands of times in each project. The effect adds up. Even if I were using my native language, the equivalent "om" in Swedish would eventually bug the crap out of me as well. But with a translator, I can switch between two or more variants. Now, the most important thing is of course to have the first term in the native language. I also notice that when I use "if", I prefer writing the comments in English. When I switch to "om", my mind prefers writing the comments in Swedish (my native language). This is because the lowest-level terms, such as "if", become hard-coded in my mind after years of use, and they switch my mind over to the respective language. Each switch to another language takes an additional toll on the mind (not huge, but enough to push the mind outside its comfort zone). So this is not just an idea. My whole mind actually craves being able to write, including code, in my native language. I could get by continuing to use the "normal" English terms, but I would be disappointed and less vital in my work.

Quote from: dmoore
> It's not clear that modifying the IDE itself is the best way to achieve what you want. If I am reading correctly, I think what you have in mind is sort of a pre-processor that converts the user's native language source to language that the compiler understands ("program language") using a dictionary that maps native "words" to their compile-able equivalent. You also need to be able to convert the compiler messages back to their native language equivalents. At least from the compiler perspective, the best way to do this might be a command line tool that wraps around the GCC toolchain and handles the necessary conversions. You could then add support for this in Code::Blocks (or any other IDE that has a flexible build system) by calling your command line tool instead of the regular GCC toolchain.

Such a wrapper may indeed be the best way to do it. If it is decided that this is the way to go, then I will do it. The problem may be that I need to write one such wrapper for each compiler. That can get nasty considering how variants such as MinGW, MinGW64, and now TDM keep emerging.

Quote from: dmoore
> On the other hand, allowing the user to use their native language for code completion (and other things like class and project wizards) would require very significant changes to the IDE. The CC parser would need to convert the native language to program language when parsing, and tokens would need to be converted from program language to native language for display in the UI.

I was kind of hoping that there would be a solution that would not require huge changes to the IDE. Actually, I am already running a command line tool for all my programming tasks that translates the code. This works, and it also works for debugging.

The reasons I am still bugging you about this are, first, that I would like to make this available to everyone as an option in the world of programming. An additional bonus would be if code completion worked with this, but for me personally that is not necessary. I get by without full code completion.

At least, I do not want to place any burden on you folks, so I hope a simple solution is possible without too much effort. That is why this needs to be discussed, so we can figure out if there is a solution that is simple enough to implement in CodeBlocks. I think it would be good for the reputation of CodeBlocks if it could brag about this feature. I am not sure; intuitively I just think it would be a winning concept.

Offline beqroson

« Reply #38 on: November 11, 2013, 01:47:10 am »
Quote from: beqroson
> Suppose I create a tool based on a variant of UTF-8 that I call UTF-82. The only difference between UTF-8 and UTF-82 is that the forbidden bytes C0-C1 and F9-FF are accepted in the string. To scan or display a UTF-82 string, one simply ignores those bytes. Would you include UTF-82 as a file format in CodeBlocks? That would be very helpful.
>
> Update: Or am I supposed to add it myself? If I post patches, would you accept them and have the format added? Or is this someone else's task? I do not know what rules apply here.

I am working on designing the console application to use XML files. Later on, when creating the plugin, the code can just be copied over, I hope. But there is a design issue that I am pondering: whether to use UTF-82 for this at all. The good thing about UTF-82 is that one can embed control strings in the text. Instead of creating a new document format, UTF-82 could provide a standardized way of creating your own document format, with the ground structure, such as the UTF encoding, already in place. The downside would be the hassle of implementing the ability to open any UTF-82 document, for example in CodeBlocks. It would make a totally different IDE, and for what purpose? Just to add the translation capability. Hardly worth the effort. Then again, for such a versatile translation mechanism, UTF-82 looks like a creamy cake to me. What do you think, should I ignore worldly facts and implement it as UTF-82?
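
To make the XML option concrete, here is a small sketch in Python of loading a translation table from XML. The layout is purely hypothetical: the element name `word` and the attributes `native` and `program` are invented for this example and are not the format of the console application mentioned above. The file is ordinary UTF-8 XML, so a standard parser can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical table layout; element and attribute names are invented here.
TABLE_XML = """\
<table native="sv" program="c++">
  <word native="om" program="if"/>
  <word native="annars" program="else"/>
  <word native="medan" program="while"/>
</table>
"""

def load_table(xml_text):
    """Return a dict mapping native words to program-language words."""
    root = ET.fromstring(xml_text)
    return {w.get("native"): w.get("program") for w in root.iter("word")}

table = load_table(TABLE_XML)
print(table["om"])  # if
```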

Offline stahta01

« Reply #39 on: November 11, 2013, 10:01:07 am »
Code::Blocks uses a very slightly modified version of TinyXML, found in the CB SVN path src/base/tinyxml.

If UTF-82 is NOT supported by TinyXML, you would have to write your own way of reading/writing XML.
Or submit patches to TinyXML and have them accepted there.
Edit2: The above is based on the idea you are making a CB Plugin or might in the future.

Edit: Have you ever built CB from SVN source? If not, I suggest doing so.

Tim S.
« Last Edit: November 11, 2013, 02:39:10 pm by stahta01 »

Offline beqroson

« Reply #40 on: November 11, 2013, 03:37:05 pm »
Quote from: stahta01
> Edit: Have you ever built CB from SVN source? If not, I suggest doing so.

Because OllyDbg only runs on 32-bit Windows, and the bug I reported on BerliOS is now suspected to originate from the 64-bit implementation, I am compiling from SVN at the moment. So your suggestion is already being followed.

BTW, good point about the XML. If TinyXML does not support UTF-82, then you are right, it complicates things considerably. I know that eventually, to be usable, UTF-82 needs to be standardized. I expect the implementation of UTF-82 to change several times before that happens, if it happens.

Offline Alpha

« Reply #41 on: November 11, 2013, 04:15:27 pm »
Quote from: beqroson
> However, a scenario where we would have only one global language, such as English? I am very pessimistic about that. What I see is that a language tends to accumulate a certain set of traits. With only one language we would, by that theory, have only one set of traits at a global level. Such a reduction of the wealth of human culture would be... simply unacceptable.

Perhaps I spoke with too many absolutes.  I was not arguing that English is a special language that everyone needs to learn.
However, for English-based programming languages (specifically C/C++, which is Code::Blocks' main target), I consider using non-English identifier names and comments to be as confusing as mixing vocabulary from multiple languages to form a sentence.

That said, I do believe your idea has merit, as it sounds that your goal is lossless and bidirectional.

Offline beqroson

« Reply #42 on: November 11, 2013, 05:05:11 pm »
Quote from: Alpha
> as it sounds that your goal is lossless and bidirectional.

Yep, those are two aspects of the goal.

Offline beqroson

« Reply #43 on: November 11, 2013, 05:18:35 pm »
Quote from: Alpha
> I consider using non-English identifier names and comments to be as confusing as mixing vocabulary from multiple languages to form a sentence.

I believe that identifier names are fixable with an advanced translator mechanism. That can be a good win for the programmer. However, it demands specifying two names for each identifier. Comments are another deal altogether. I have no clue how comments could be handled, and I do not think that there is a solution for comments that is simple enough to be worth the effort.
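
One way to picture the "two names for each identifier" requirement is a table that is checked to be one-to-one, so that translating there and back loses nothing. The following Python sketch assumes invented identifier pairs and helper names; it is not the mechanism of any existing translator.

```python
import re

def make_bidirectional(pairs):
    """Build forward and reverse dicts, rejecting tables that would lose information."""
    forward, reverse = {}, {}
    for native, program in pairs:
        if native in forward or program in reverse:
            raise ValueError(f"duplicate entry: {native!r} <-> {program!r}")
        forward[native] = program
        reverse[program] = native
    return forward, reverse

def swap_words(text, table):
    """Whole-word replacement using `table`; unknown words pass through."""
    return re.sub(r"\w+", lambda m: table.get(m.group(0), m.group(0)), text)

# Hypothetical identifier pairs: (native name, program name).
PAIRS = [("FareQualcosa", "DoSomething"), ("contatore", "counter")]
to_program, to_native = make_bidirectional(PAIRS)

original = "FareQualcosa(contatore);"
english = swap_words(original, to_program)
print(english)                                   # DoSomething(counter);
print(swap_words(english, to_native) == original)  # True
```

The round trip only holds if the native source never uses a program-side name directly; a real tool would have to detect that case.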

As for comments, I have been pondering using a huge database of pretranslated strings. But that is also a hassle if you must assemble all comments from templated text strings. Besides, it means you need to connect to the database, update the database and so forth. Too cumbersome.

Offline dmoore

« Reply #44 on: November 11, 2013, 06:22:47 pm »
Quote from: beqroson
>> I consider using non-English identifier names and comments to be as confusing as mixing vocabulary from multiple languages to form a sentence.
>
> I believe that identifier names are fixable with an advanced translator mechanism. That can be a good win for the programmer. However, it demands specifying two names for each identifier. Comments are another deal altogether. I have no clue how comments could be handled, and I do not think that there is a solution for comments that is simple enough to be worth the effort.
>
> As for comments, I have been pondering using a huge database of pretranslated strings. But that is also a hassle if you must assemble all comments from templated text strings. Besides, it means you need to connect to the database, update the database and so forth. Too cumbersome.

With both this stuff about comments and your UTF-82 talk, I think you are WAY overcomplicating things.

To me, the potential "win" here is to create a set of standardized translation tables that translate programming language keywords and all of the exportable user-defined tokens of the libs you care about (i.e. public classes, functions and variables of the libraries the user would use) to and from their foreign language equivalents. Comments, especially the doc strings for toolkits like wxWidgets, would be nice, but they aren't necessary to get a program to compile, and dealing with them in the right way has to be part of a much larger translation effort.

To reiterate, you don't really need to integrate this into C::B to make your proof of concept. And you shouldn't, because if it is useful to C::B users it will be useful to programmers more generally. Why don't you start by writing a simple tool that takes the user's foreign language source files (UTF-8) and a specified translation table, and outputs the English programming language equivalent (and vice versa)? From there it would be easy enough to integrate into GCC and other toolchains. Then turn it into a library, and IDEs will be able to take advantage of it too.
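
As a starting point for such a tool, here is a minimal Python sketch that translates keywords and identifiers while leaving string literals and comments alone, in line with the point that comments need not be translated for the program to compile. The tab-separated table format, the Italian entries and the function names are all assumptions made for this example.

```python
import re

# Literals and comments are listed first so they are matched before bare words.
TOKEN = re.compile(
    r"""
      (?P<skip> "(?:\\.|[^"\\])*"      # string literal
              | '(?:\\.|[^'\\])*'      # character literal
              | //[^\n]*               # line comment
              | /\*.*?\*/ )            # block comment
    | (?P<word> \w+ )
    """,
    re.VERBOSE | re.DOTALL,
)

def parse_table(text):
    """Parse 'native<TAB>english' lines into a dict; blank lines are skipped."""
    table = {}
    for line in text.splitlines():
        if line.strip():
            native, english = line.split("\t")
            table[native.strip()] = english.strip()
    return table

def translate(source, table):
    """Swap whole words via `table`, copying literals and comments verbatim."""
    def replace(match):
        word = match.group("word")
        if word is None:  # a literal or a comment
            return match.group(0)
        return table.get(word, word)
    return TOKEN.sub(replace, source)

table = parse_table("se\tif\naltrimenti\telse\nstampa\tprint\n")
reverse = {english: native for native, english in table.items()}

italian = 'se (x) stampa("se"); // se x'
english = translate(italian, table)
print(english)                               # if (x) print("se"); // se x
print(translate(english, reverse) == italian)  # True
```

Reading the table from a file and wiring the function into a command line are left out; the same `translate` function serves both directions once the table is reversed.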