Special characters

From Egghelp.ru - TCL/TK Eggdrop Wiki
Revision as of 07:55, 25 March 2009; Deniska (talk | contribs)


Eggdrop scripts that choke

Many eggdrop scripts choke on nicks, usernames, or text, that contain characters that have a special meaning in Tcl, especially [, ], {, }, ", \, and $. For that reason, some IRC channels ban nicks that contain [ or {.

Yet the problem can be avoided completely by writing correct Tcl code. There are two golden rules that must be followed.

The first golden rule of Tcl

Include all of the splits and joins that are necessary to convert between lists and strings.

It is best to regard lists and strings as if they were two separate types, putting into our code all the splits and joins that are necessary to convert between them. If we do that, the interpreter will take the necessary actions with any special Tcl characters that may be present. A big bonus is that we do not need to understand how lists are represented (a topic that is not very straightforward when Tcl special characters are present). Instead, we simply let the interpreter take care of that aspect for us.

For example, one eggdrop script that I was given contained three lines similar to the following:

bind dcc o tell do_dcc_tell
proc do_dcc_tell { hand idx arg } {
 set arg [lrange $arg 0 0]

When the procedure is called, the first word of arg is a nick. The script choked on nicks that contained [ or {. It was the third line that caused the problem.

The script writer could have written instead

set arg [lindex $arg 0]

but that also can lead to failure if the nick contains Tcl special characters.

To see why the above code is incorrect, we can read in tcl-commands.doc that the value of arg will be "the argument string". Yet lrange and lindex should be applied only to lists. Therefore, before we apply lrange or lindex, we must convert the string to a list, which we can do by using split. So to correct the first version we change that line to

set arg [lrange [split $arg] 0 0]

However, that is still incorrect. The rest of the procedure expects arg to be a string, not a list, yet lrange returns a list (in this case of one element). Therefore we must use join to convert it to a string. Our final, correct, version is

set arg [join [lrange [split $arg] 0 0]]

Alternatively (and, as it turns out, more neatly), we could correct the other version, i.e.

set arg [lindex $arg 0]

in the same way, giving

set arg [lindex [split $arg] 0]

In this case we do not need join, since lindex returns the nick that the user typed, which is therefore already a string.
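The variants discussed above can be compared in plain tclsh, outside eggdrop. This is only a sketch: the helper names first_word_raw, first_word_join and first_word_split are made up for the illustration, and the sample inputs stand in for whatever a user might type.

```tcl
# Hypothetical helpers (not part of eggdrop or of the original script);
# each one isolates a version of the line under discussion.

# Incorrect: applies lrange directly to a string.
proc first_word_raw {arg} {
    return [lrange $arg 0 0]
}

# Correct: string -> list with split, list -> string with join.
proc first_word_join {arg} {
    return [join [lrange [split $arg] 0 0]]
}

# Correct and shorter: lindex returns the element itself.
proc first_word_split {arg} {
    return [lindex [split $arg] 0]
}

set typed {a[b]c hello there}
puts [first_word_raw $typed]     ;# prints {a[b]c}: list quoting has been added
puts [first_word_join $typed]    ;# prints a[b]c
puts [first_word_split $typed]   ;# prints a[b]c

# A string that starts with an open brace is not a well-formed list,
# so the incorrect version raises an error instead of returning a nick.
set typed "\{abc hello"
puts [catch {first_word_raw $typed} msg]   ;# prints 1: an error was raised
puts [first_word_split $typed]             ;# prints the first word, leading brace included
```

The two corrected versions return exactly what was typed, whichever special characters it contains.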

A point worthy of note is that if the last argument of do_dcc_tell were called args instead of arg, then the situation would be different. If a Tcl procedure has a last argument with the special name args, that argument behaves differently from arguments with any other name.

Here is another example eggdrop script. This one is correctly written.

bind pub B|B !orderfor pub_orderfor
proc pub_orderfor {nick uhost hand chan rest} {
 global botnick
 set rest [split $rest]
 if {[llength $rest] < 2} {
   putnotc $nick "Syntax: !orderfor\
   <nick to order something for>\
   <what to order>"
   return 0
 }
 putchan $chan "\001ACTION sets\
 [join [lrange $rest 1 end]] in front of\
 [lindex $rest 0], compliments of $nick.\001"
 return 0

}

Note that putnotc and putchan are defined in alltools.tcl which is distributed with eggdrop.

The original version (that I downloaded from a script library) did not contain

set rest [split $rest]

but did contain

set cmd [string tolower [lindex $rest 0]]

where it is clear that we are making the mistake of applying lindex to a string.

The original downloaded script would sometimes give incorrect replies if the input contained Tcl special characters.
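An incorrect reply, rather than an error, can arise when the typed text happens to be a well-formed list whose quoting differs from what the user meant. The following sketch runs in plain tclsh; the sample text is invented for the illustration.

```tcl
# Invented input: the user types braces around the first word.
set rest "{tea} for two"

# Incorrect: the string is parsed as a list, so the braces are taken
# as list quoting and silently disappear.
set wrong [lindex $rest 0]
puts $wrong    ;# prints tea

# Correct: split first, so every typed character is preserved.
set right [lindex [split $rest] 0]
puts $right    ;# prints {tea}
```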

The second golden rule of Tcl

If the script contains a command that is to be evaluated when a timer expires, it must be in the correct form.

The list command provides a convenient way of putting the command into the correct form. It inserts backslashes and braces as appropriate to cope with any Tcl special characters that may be present.

Here is a procedure where the second golden rule has been broken. It gives an autogreet to anyone who joins the channel, unless they have already received the autogreet within the previous three minutes.

bind join - * do_jn_msg
proc do_jn_msg {nick uhost hand chan} {
 global botnick jn_msg_done
 if {$nick == "X" || $nick == $botnick} {
   return 0
 }
 if {[info exists jn_msg_done($nick:$chan)]} {
   return 0
 }
 set jn_msg_done($nick:$chan) 1
 timer 3 "unset jn_msg_done($nick:$chan)"
 puthelp "NOTICE $nick :Welcome to $chan"
 return 0

}

Suppose that nick has the value abc and chan has the value #room. The double quotes will cause variable substitution of $nick and $chan so that the command given to the timer will be

unset jn_msg_done(abc:#room)

When the timer expires, the unset command will work with no problem.

But suppose now that instead of having the value abc, nick has the value a[b]c. The double quotes will cause variable substitution of $nick and $chan so that the command given to the timer will be

unset jn_msg_done(a[b]c:#room)

When the timer expires, the Tcl interpreter will try to perform one round of substitutions on jn_msg_done(a[b]c:#room), since the first step in evaluating any command is to do one round of substitutions on the words of the command.

Therefore, the interpreter will try to do command substitution of [b]. It will give an error message to the effect that there is no command called b. Presumably if the nick were a[die]c, the die command would be executed, shutting down the bot.

To correct the script, we replace the timer command with the following.

timer 3 [list unset jn_msg_done($nick:$chan)]

If any Tcl special characters are present, list adds backslashes and/or braces as appropriate to give the correct form for a command. We do not need to know what that form is or even think about it. The unset command, now in the correct form, is passed to the timer and works with no problem whatsoever when the timer expires.

In fact, in the case of a[b]c, the list command simply inserts some extra braces. If the nick were a[b]c{, it would insert several backslash characters. But we do not need to know any of that. We simply use the list command and let the interpreter look after those details.

Whenever we wish a command to be evaluated when a timer expires, the list command can be used to put it into the correct form.
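The difference between the two forms can be tried out in plain tclsh, where eggdrop's timer command is not available. In the sketch below, run_later is a hypothetical stand-in that evaluates the stored script at global level straight away, on the assumption that this is close enough to what happens when a timer expires; demo_timer_cmd is likewise an invented helper.

```tcl
# Hypothetical stand-in for eggdrop's timer: evaluate the stored script
# at global level immediately instead of three minutes later.
proc run_later {script} {
    uplevel #0 $script
}

# Sets the flag, builds the unset command either with double quotes
# (use_list 0) or with list (use_list 1), and evaluates it.
# Returns a list: failed (0 or 1), error message, whether the flag is still set.
proc demo_timer_cmd {nick chan use_list} {
    global jn_msg_done
    set jn_msg_done($nick:$chan) 1
    if {$use_list} {
        set script [list unset jn_msg_done($nick:$chan)]
    } else {
        set script "unset jn_msg_done($nick:$chan)"
    }
    set failed [catch {run_later $script} message]
    set still_set [info exists jn_msg_done($nick:$chan)]
    unset -nocomplain jn_msg_done($nick:$chan)   ;# clean up for the next call
    return [list $failed $message $still_set]
}

puts [demo_timer_cmd abc "#room" 0]       ;# quoted form, harmless nick: works
puts [demo_timer_cmd {a[b]c} "#room" 0]   ;# quoted form: fails, flag stays set
puts [demo_timer_cmd {a[b]c} "#room" 1]   ;# list form: works, flag is removed
```

In the second call the first element of the result is 1, because the interpreter tried to run a command called b; in the third call it is 0 and the flag has been unset.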

More usually in scripts that I have seen that are similar to the above example, it is $uhost that is used rather than $nick. But userhosts can contain Tcl special characters too, so exactly the same problem can arise, and can be solved in exactly the same way as above, by using the list command.

In fact, it seems better to use $uhost than to use $nick. I used $nick in the above example since that makes it easier for the reader to experiment with the script, having pasted it into a Tcl source file.

Other methods of dealing with problems that occur with Tcl special characters

There are other methods that are sometimes used to avoid the problems that can occur with Tcl special characters.

E.g. some scripts filter their input with something like the following:

proc filt {data} {
 regsub -all -- \\\\ $data \\\\\\\\ data
 regsub -all -- \\\[ $data \\\\\[ data
 regsub -all -- \\\] $data \\\\\] data
 regsub -all -- \\\} $data \\\\\} data
 regsub -all -- \\\{ $data \\\\\{ data
 regsub -all -- \\\" $data \\\\\" data
 return $data
}

Such a filter can indeed sometimes cure the Tcl special character problem that can occur with a badly written eggdrop script. But whether it will or not depends on the details of the script. It may not, or it may solve the problem only for some cases. Adding such a filter to a script can even introduce problems.

If a script chokes on Tcl special characters, then it is far better to correct the code so that the two golden rules are obeyed than to apply a kludge such as the above filter, which is not guaranteed to work in all circumstances and might even introduce further problems.
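As one illustration of how such a filter can introduce a problem, the sketch below applies it in front of code that already obeys the first golden rule. It runs in plain tclsh; the filt procedure is the one quoted above, and the sample text is invented.

```tcl
# The filter quoted above: it puts a backslash in front of \ [ ] } { and ".
proc filt {data} {
    regsub -all -- \\\\ $data \\\\\\\\ data
    regsub -all -- \\\[ $data \\\\\[ data
    regsub -all -- \\\] $data \\\\\] data
    regsub -all -- \\\} $data \\\\\} data
    regsub -all -- \\\{ $data \\\\\{ data
    regsub -all -- \\\" $data \\\\\" data
    return $data
}

# Invented input: a nick with brackets followed by some text.
set text {a[b]c hello}

# Correct code alone returns the nick exactly as typed.
set plain [lindex [split $text] 0]
puts $plain       ;# prints a[b]c

# The same code behind the filter returns a nick with stray backslashes.
set filtered [lindex [split [filt $text]] 0]
puts $filtered    ;# prints a\[b\]c
```

The filtered value no longer matches the real nick, so anything that compares it with, or sends a message to, that nick would go wrong.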

A point worthy of note about args

Putting args as the last argument of a Tcl procedure allows the use of a variable number of input arguments. But there is a potential source of confusion if the procedure is one that is called by an eggdrop bind command.

In a script that I downloaded I saw code similar to the following:

bind dcc - m2f dcc_calc_m2f
proc dcc_calc_m2f {hand idx args} {
 if {[llength $args] == "1"} {

The author was no doubt thinking that if a user typed .m2f aaa bbb ccc then the value of args would be a list of three elements, aaa, bbb and ccc, and that therefore the value of [llength $args] would be 3.

In fact, the value of args would be a list of one element, that element being the string aaa bbb ccc. Whatever string the user types, the value of [llength $args] will always be 1.

Similarly, the value of [lindex $args 0] will not be the string aaa, but the string aaa bbb ccc.
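The behaviour described above can be reproduced in plain tclsh by calling a procedure the way the text says eggdrop does, that is, with everything the user typed passed as one string. The procedure names and the handle and idx values below are invented for the illustration.

```tcl
# Last argument named args, as in the downloaded script.
proc count_with_args {hand idx args} {
    return [llength $args]
}

# Last argument with an ordinary name, split into words as the first
# golden rule requires.
proc count_with_arg {hand idx arg} {
    return [llength [split $arg]]
}

# The whole typed text arrives as ONE argument.
puts [count_with_args somehand 5 "aaa bbb ccc"]   ;# prints 1
puts [count_with_arg  somehand 5 "aaa bbb ccc"]   ;# prints 3
```

With an ordinary argument name and an explicit split, the word count matches what the author of the script presumably expected.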

The eggdrop pub and msg binds behave in the same way.

The above properties of eggdrop's bind command have been confirmed by testing with eggdrop version 1.6.6.

Peterre paperclip444 at peterre dot demon dot co dot uk (email address deliberately obscured to deter address collection by spammers)