
The simple code below works as expected on my machine when run with bash:

function ⏰(){
 date
}
⏰

Could this cause problems for other people using the script, or is it universal?

I'm wondering because I've never seen anything like this in other source code so far.

Edit: The possibilities are unlimited; for example, an emoji could be used to quickly distinguish a function's role.

A bomb emoji for something that can modify or remove files, another emoji if it's a work in progress, another for an interactive menu...

I guess we would need to create a standard for all of that, but it seems like an interesting idea.
Maybe a random string of ~5 characters could help a lot in understanding what the code is doing. (Of course, we would need to learn how to read them.)

Another edit: I'm giving it a shot. For now, if I fold all my functions in my editor (or run `cat myscript.sh|grep function`), they look like this. (My Unicode looks much better in Geany or my terminal than it does here.)

function ⬚_1(){
function ⬚⬚_2(){
function ⬚⬚⬚__D(){
function ⬚⬚⬚⬚__X(){
function ⬚⬚⬚⬚⬚__Y(){
function ⬚⬚⬚⬚⬚⬚_❓_P(){
function ⬚⬚⬚⬚__Z(){
function ⬚⬚⬚⬚⬚_❓_U(){
function ⬚⬚⬚⬚⬚_❓_O(){

I use an unusual indentation marker (⬚) to show how the functions are related to each other and a symbol such as ❓ to clearly distinguish their role. (Of course, these are not my real function names; I just put a random letter at the end, but even without them we can clearly see the relationships.)

bob dylan
  • The _u_ letter stands for _universal_ in utf8. – Ipor Sircer Nov 27 '18 at 10:36
  • 2
    @IporSircer and it's so Universal that someone invented utf16 – Kiwy Nov 27 '18 at 10:42
  • 8
    I'd say it's unsafe for backward-compatibility reasons: if you have to run your script on an old server, this might not work, as bash emoji support is recent. But it's probably OK on recent Linux. – Kiwy Nov 27 '18 at 10:58
  • 18
    @Ipor no, it stands for Unicode (and the “Uni” in Unicode stands for universal). – Stephen Kitt Nov 27 '18 at 10:59
  • Related [security documentation from Unicode](https://www.unicode.org/reports/tr36/) – RubberStamp Nov 27 '18 at 11:41
  • 5
    How "universal" do you want universal to be? Works on Cygwin, with the usual UTF-8 vs. UTF-16 problems? On modern IBM z/OS system services, which still have to deal with the EBCDIC charset? On historical Unix computers which don't use 8-bit bytes as smallest unit? The POSIX restriction is there for a reason... – dirkt Nov 27 '18 at 12:11
  • 6
    The names of functions must be made up of characters from the portable character set, according to POSIX. If "universal" means "any shell", then it would not be universal in this sense. – Kusalananda Nov 27 '18 at 12:18
  • 4
    Not if you wish to be POSIX-compliant which only permits function names which match this regex: [a-zA-Z_][a-zA-Z0-9_]* – fpmurphy Nov 27 '18 at 15:29
  • 6
    If you find yourself asking whether it is safe to do in a shell script, the answer is most probably no. Heck, not even doing `echo $foo` is safe. – Matteo Italia Nov 27 '18 at 16:49
  • 1
    It does not work on my computer when I use `sh`. But I do not know how to get the version of `sh` on my computer; `sh --version` does not work. – 12431234123412341234123 Nov 28 '18 at 10:08
  • Though I'd have to wonder _why_ you'd want to do this, in a bash script or any source code. Looks like it'd be a pain to go and actually _call_ your function. There's a reason most programming is done in ASCII, after all. – Sebastian Lenartowicz Nov 28 '18 at 12:56
  • @SebastianLenartowicz, the possibilities are unlimited; an emoji can be used to quickly distinguish a function's role. A bomb emoji for a function that can modify or remove files, another emoji if it's a work in progress... Just off the top of my head. – bob dylan Nov 28 '18 at 13:00
  • @bobdylan: The first one is open to interpretation, and the second makes for a refactoring challenge. You might see the bomb as meaning "modify or remove files", whereas someone else might see it as "generically dangerous function". And the second one means that you'd have to find-and-replace it as soon as that call *stopped* being a work in progress. In fact, I'm reminded of a [particular TDWTF story with a similar premise](https://thedailywtf.com/articles/A-Peculiar-System). – Sebastian Lenartowicz Nov 28 '18 at 13:07
  • I use for most of my functions. –  Nov 28 '18 at 14:38
  • 1
    It could be a problem if your virtual console (Ctrl+Alt+Fn) is not able to type or display emoji. Also, the bomb emoji might be dangerous because it's a black icon while your terminal background is also black. And what if you later want to change your terminal background? Then you'd need to change all the similarly colored emoji. – 林果皞 Nov 28 '18 at 18:41
  • It may impose additional difficulties on disabled (poor eyesight) programmers using text-to-voice interfaces. – aaguilera Dec 04 '18 at 09:30

1 Answer


A useful guideline for this is the "Portable Operating System Interface" (POSIX), a family of standards that is implemented by most Unix-like systems. It is usually a good idea to limit shell scripts to features mandated by POSIX to make sure they will be usable across different shells and platforms.

According to the POSIX specification of function definitions in the "Shell Command Language":

The function is named fname; the application shall ensure that it is a name (see the Base Definitions volume of IEEE Std 1003.1-2001, Section 3.230, Name). An implementation may allow other characters in a function name as an extension.

Following the link to the definition of a "name":

In the shell command language, a word consisting solely of underscores, digits, and alphabetics from the portable character set.

That character set contains only characters between U+0000 and U+007E.
Therefore characters like "⏰" (U+23F0) are not valid in a POSIX-compliant identifier.

Your shell might accept them, but that doesn't guarantee that others will as well.
To be able to use your script across different platforms and software versions, you should avoid using non-compliant identifiers like this.
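
If you want to check candidate function names before using them, a minimal sketch along these lines might help. The helper name `is_portable_name` is made up for this example and is not part of any shell or standard; it assumes that a "name" means ASCII letters, digits and underscores only, not starting with a digit, which is the same rule as the `[a-zA-Z_][a-zA-Z0-9_]*` regex quoted in the comments on the question.

```shell
#!/bin/sh
# is_portable_name NAME  (hypothetical helper, for illustration only)
# Exit status 0 if NAME consists solely of ASCII letters, digits and
# underscores and does not start with a digit; exit status 1 otherwise.
is_portable_name() {
    # The allowed characters are spelled out instead of using ranges such
    # as [a-z], whose meaning can depend on the current locale.
    case $1 in
        '') return 1 ;;                 # an empty string is not a name
        [0123456789]*) return 1 ;;      # must not start with a digit
        *[!abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_]*)
            return 1 ;;                 # contains a character outside the set
    esac
    return 0
}

# Example candidates, chosen only for this demonstration.
for candidate in date_now _helper 2fast '⏰'; do
    if is_portable_name "$candidate"; then
        printf '%s: portable\n' "$candidate"
    else
        printf '%s: not portable\n' "$candidate"
    fi
done
```

This only tests the spelling of the name. It says nothing about whether a particular shell happens to accept other characters as an extension.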

n.st
  • 19
    Good rule of thumb... if your standard keyboard doesn't have a key for it... don't use it. – SnakeDoc Nov 27 '18 at 19:44
  • 6
    @SnakeDoc https://www.youtube.com/watch?v=3AtBE9BOvvk "standard" emoji keyboard ;) – Jorn Nov 27 '18 at 22:42
  • 9
    @Jorn Maybe I should have said "if you can't buy the keyboard from a normal retail store"... lol – SnakeDoc Nov 27 '18 at 22:49
  • @SnakeDoc Someone will surely then open a normal retail store selling emoji keyboards. – Dev Nov 28 '18 at 08:06
  • 4
    @SnakeDoc It's a good start - but the keyboard I am typing this on has a key for £, €, and ¬ all of which are outside the portable character set. More seriously, some colleagues have keyboards with ä, ö, ü, è, é, and ß on them. They are all letters but are not good for portable function names. – Martin Bonner supports Monica Nov 28 '18 at 09:27
  • So... is bash itself not POSIX-compliant because of this? – bob dylan Nov 28 '18 at 10:42
  • 1
    @bobdylan `bash` is POSIX-compliant (at least in this area) as it supports all characters that it *has to* support. It is always free to support *more* if it wants to; in this particular case the POSIX specification even reminds the implementer of this freedom ("[…] implementation may allow other characters […]"). – n.st Nov 28 '18 at 11:13
  • 2
    POSIX-compliant but not POSIX-limited? – bob dylan Nov 28 '18 at 11:18
  • 1
    @bobdylan That's a good way to phrase it. – n.st Nov 28 '18 at 11:51
  • @SnakeDoc We don't need a special keyboard, we have an Emacs mode for editing with emoji : https://github.com/iqbalansari/emacs-emojify – Alex Vong Nov 28 '18 at 17:35
  • @AlexVong oh ya, like this (from Chrome): https://i.ibb.co/8c41kMs/emoji-fail.png – SnakeDoc Nov 28 '18 at 17:40
  • @SnakeDoc Don't know why it doesn't work on chrome (it works on firefox). In any case, we can always asciify it as :) – Alex Vong Nov 28 '18 at 17:43